Is there a SQL Server function for displaying pseudo-random database records?

2023-01-06 11:24 问答作者：

is there a way to sh开发者_StackOverflowow a certain amount of random records from a database table, but heavily influenced by the date and time of creation.

for example:

showing 10 records at random, but
showing latest with more frequency than the earliest
say there are 100 entries in the news table
latest (by date time) record would have an almost 100% chance of being selected
first (by date time) record would have an almost 0% chance of being selected
50th (by date time) record would have a 50% chance of being selected

is there such a thing in mssql directly? or is there some function (best practice) in c# i can use for this?

thnx

** edit: the title is really horrible, i know. please edit if you have a more descriptive one. thnx

A quite simplistic way might be something like the following. Or at least it might give you a basis to start with.

WITH N AS
(
SELECT id,
       headline,
       created_date,
       POWER(ROW_NUMBER() OVER (ORDER BY created_date ASC),2) * /*row number squared*/
          ABS(CAST(CAST(NEWID() AS VARBINARY) AS INT)) AS [Weight] /*Random Number*/
  FROM news
  )
  SELECT TOP 10 
       id,
       headline,
       created_date FROM N 
  ORDER BY [Weight] DESC

For a random sample, see Limiting Result Sets by Using TABLESAMPLE. Eg. Select a sample of 100 rows from a table:

SELECT FirstName, LastName
FROM Person.Person 
TABLESAMPLE (100 ROWS);

For weighted sample, with preference to most recent records (I missed this when first reading the question), then Martin's solution is better.

You can pick an N using exponential distribution (for instance), and than SELECT TOP(N) ordered by date, and choose the last row. You can choose the exponent according to the number of existing rows.

Unfortunately I don't know MSSQL, but I can give a high-level suggestion.

Get the date in UNIX time (or some other increasing integral representation)
Divide this value by the max for each column to get a percentage.
Get a random number and multiply by the percentage above.
Sort your columns by this value and take the top N.

This will give more weight to the most recent results. If you want to adjust the relative frequency of older vs later results, you can apply an exponential or logarithmic function to the values before taking the ratio. If you're interested, let me know and I can provide more info.

If you can filter results after DB access, or you can submit a query with order by and process results with a reader, then you can add a probabilistic bias to the selection. You see that the higher the bias, the harder the test inside the if, the more random the process.

var table = ...  // This is ordered with latest records first
int nItems = 10;  // Number of items you want
double bias = 0.5;  // The probabilistic bias: 0=deterministic (top nItems), 1=totally random
Random rand = new Random();

var results = new List<DataRow>();  // For example...

for(int i=0; i<table.Rows.Count && results.Count < nItems; i++) {
    if(rand.NextDouble() > bias)
        // Pick the current item probabilistically
        results.Add(table.Rows[i]);  // Or reader.Next()[...]
}

继续阅读：random sql-server visual-studio

Is there a SQL Server function for displaying pseudo-random database records?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？