SQL Server performance with a large number of tables in database
I am updating a piece of legacy code in one of our web apps. The app allows the user to upload a spreadsheet, 开发者_如何学编程which we will process as a background job.
Each of these user uploads creates a new table to store the spreadsheet data, so the number of tables in my SQL Server 2000 database will grow quickly - thousands of tables in the near term. I'm worried that this might not be something that SQL Server is optimized for.
It would be easiest to leave this mechanism as-is, but I don't want to leave a time-bomb that is going to blow up later. Better to fix it now if it needs fixing (the obvious alternative is one large table with a key associating records with user batches).
Is this architecture likely to create a performance problem as the number of tables grows? And if so, could the problem be mitigated by upgrading to a later version of SQL Server ?
Edit: Some more information in response to questions:
- Each of these tables has the same schema. There is no reason that it couldn't have been implemented as one large table; it just wasn't.
- Deleting old tables is also an option. They might be needed for a month or two, no longer than that.
Having many tables is not an issue for the engine. The catalog metadata is optimized for very large sizes. There are also some advantages on having each user own its table, like ability to have separate security ACLs per table, separate table statistics for each user content and not least improve query performance for the 'accidental' table scan.
What is a problem though is maintenance. If you leave this in place you must absolutely set up task for automated maintenance, you cannot let this as a manual task for your admins.
I think this is definitely a problem that will be a pain later. Why would you need to create a new table every time? Unless there is a really good reason to do so, I would not do it.
The best way would be to simply create an ID and associate all uploaded data with an ID, all in the same table. This will require some work on your part, but it's much safer and more manageable to boot.
Having all of these tables isn't ideal for any database. After the upload, does the web app use the newly created table? Maybe it gives some feedback to the user on what was uploaded?
Does your application utilize all of these tables for any reporting etc? You mentioned keeping them around for a few months - not sure why. If not move the contents to a central table and drop the individual table.
Once the backend is taken care of, recode the website to save uploads to a central table. You may need two tables. An UploadHeader table to track the upload batch: who uploaded, when, etc. and link to a detail table with the individual records from the excel upload.
I will suggest you to store these data in a single table. At the server side you can create a console from where user/operator could manually start the task of freeing up the table entries. You can ask them for range of dates whose data is no longer needed and the same will be deleted from the db.
You can take a step ahead and set a database trigger to wipe the entries/records after a specified time period. You can again add the UI from where the User/Operator/Admin could set these data validity limit
Thus you could create the system such that the junk data will be auto deleted after specified time which could again be set by the Admin, as well as provide them with a console using which they can manually delete additional unwanted data.
精彩评论