comparing two string arraylists or lists
I have two SQL tables with data that I would like to compare. The tables have the following structures:
id, title,url
I want to compare the tables by title where if the strings are similar or equal it would be noted by w开发者_StackOverflowriting the records to a file. Someone suggested not doing it in SQL and copying the data and using it in .net as lists to compare.
Any suggestions on how to do this?
a typical example for title:
Table 1 - KPTH Fox 44 via
Table 2 - KPTH Fox44
I tried using LIKE in sql with wildcard '%' but his only works for these type of examples and wont work on
ABC
MSabC
This would be way easier to do it in the DB and then return the dataset of items that match. Assuming the two tables are identical and you are matching on just title your SQL would look like:
SELECT [id], [title], [url]
FROM [Table1] t1
JOIN [Table2] t2 ON t1.[title] = t2.[title]
Then iterate through each record in the result set and write whatever fields you want to a file.
If you want to just stick to SQL, you could run the query and then just save the outout to a file using a query tool for your DB like Management Studio
or something.
EDIT: Based on the 'similar' requirement you could use the SOUNDEX
command in SQL to compare if they sound similar. See MSDN for more info regarding the SOUNDEX
command.
EDIT 2: To add more information with respect to you added requirements: In SQL this similar requirment might be difficult. You could parse the title into words and then store them into a temp table and see how many match up and then based on a threshhold to assign a match condition. It looks like your data also stores a URL so you could possibly use this as well and check to the base URL to see if they match first before doing the weighted title check. I am not totally clear what outcome you are looking for other than the title.
It's really not a very simple task that I think someone will be able to just give you an answer. I am sure a prof at a university is working on this problem though.
精彩评论