Is there a way to not break code if columns in a database change?
Assume I have declared a DataTable and assigned to it the result returned from calling a stored procedure. My DataTable now contains something like the following when accessing a row from it:
string name = (string)dr["firstname"];
int age = (int)dr["age"];
If firstname is changed to first_name and age is removed, the code will obviously break because the schema it relies on no longer exists. Is there a way to keep the schema in sync with the code automatically, without doing it manually? Is there some sort of meta description file that describes the columns in the database table and updates them accordingly? Is this a case where LINQ can be helpful because of its strongly typed nature?
What about good old-fashioned views that select by column name? They always output the columns with the specified names in the specified order. If the table underneath needs to change, the view is modified if necessary but still outputs the same as it did before the underlying table change - just like interfaces for your objects. The application references the views instead of the tables and carries on working as normal. This comes down to standard database application design, which should be taught in any (even basic) data architect course - but I rarely see views actually used in business applications. In fact, the project I'm currently working on is the first where I've seen this approach taken, and it's refreshing to see it used properly.
Use stored procs. If your table changes, modify the stored proc so the output stays the same. They are used in a similar manner to views, shielding the application from the underlying table and thus insulating it from table changes. They are not sufficient if you're looking to do dynamic joins, filters and aggregates, where a view would be more appropriate.
If you want to do it application side, specify the names of the fields you're querying right in the query rather than using "select *" and relying on the field names to exist. However, if the field names on the table change, or a column is deleted, you're still stuck, you've gotta modify your query.
If the names of fields will change, but all of the fields will always exist, the content of those fields will remain the same and the fields will remain in the same order, you could reference the fields by index instead of by name.
Use an object-relational mapper, as others have suggested. I don't think this necessarily teaches good design, though; rather, you are hoping the design of the framework is good enough and appropriate for what you're doing, which may or may not be the case. I'm not really of the opinion that this is a good approach.
About the only way to prevent this is through the use of Stored Procedures which select the columns and rename them to a standard name that is returned to your application. However, this does add another layer of maintenance to the database.
This was the reason ORM solutions such as NHibernate were created.
That or a code generator based on the database schema.
Why would you not want to change the code? If age is removed why would you want to still attempt to grab it in your code?
What Linq does is try to keep all the business logic in one location, the source code, rather than splitting between Database and Source Code.
You should change the code when the data columns are removed.
As you can perceive from all the answers given, what you are looking for doesn't exist. The reason is that programs are essentially data processing routines, so you can't change your data without changing something else in the program. What if it isn't the name of the column but its type that's changing? Or what would happen if the column was deleted?
In sum, there's no good solution for such problems. Data is an integral part of the application - if it changes, expect at least some work. However, if you expect names to change (the database isn't yours, for example, and you have been informed by the owner that its name might change in the future) and you don't want to redeploy the application because of that, there are alternatives to recompiling your source code. As stated in the other answers, they include:
- Use Stored Procedures
  - You can use stored procedures to provide data to the application. In the case of the proposed change (renaming a column), the DBA or whoever is in charge of the database schema should change the stored procedure as well.
  - Pros: No need for recompilation due to minor changes in the database.
  - Cons: More artifacts that now become part of the application design; application understanding is blurred.
- Use a Mapping File
  - You can create a mapping file that pairs the name your application expects a certain column to have with the actual name the column has. Such files are inexpensive and easy to create.
  - Pros: No need for recompilation due to minor changes in the database.
  - Cons: Extra entity (class) in your design; application understanding is blurred; you need to redeploy the mapping file on change.
- Use column position instead of column name
  - Instead of referencing the name of the column, use a positional argument (dr[1]).
  - Pros: Keeps you safe from name changes.
  - Cons: Everything else. If your table changes to accommodate more data (a new column), there's a chance the numbering of columns will also change; if any of the columns is deleted, you will also have a numbering problem; etc.
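The mapping-file option could be sketched roughly as follows. This is only an illustration: ColumnMapDemo, ColumnMap and Get are hypothetical names, and the mapping is hard-coded here where a real application would presumably load it from a deployed configuration file.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ColumnMapDemo
{
    // Hypothetical mapping: logical name used by the code -> actual column name.
    // Assumption: in practice this would be read from a file shipped with the app.
    private static readonly Dictionary<string, string> ColumnMap =
        new Dictionary<string, string>
        {
            { "FirstName", "first_name" },
            { "Age", "age" },
        };

    // Reads a value by its logical name and fails with a descriptive message
    // when the mapped column is not present in the row's table.
    public static T Get<T>(DataRow row, string logicalName)
    {
        string actual = ColumnMap[logicalName];
        if (!row.Table.Columns.Contains(actual))
        {
            throw new InvalidOperationException(
                "Column '" + actual + "' (mapped from '" + logicalName + "') is missing.");
        }
        return (T)row[actual];
    }

    public static void Main()
    {
        // In-memory stand-in for the stored procedure result.
        var table = new DataTable();
        table.Columns.Add("first_name", typeof(string));
        table.Columns.Add("age", typeof(int));
        table.Rows.Add("Ada", 36);

        DataRow dr = table.Rows[0];
        string name = Get<string>(dr, "FirstName");
        int age = Get<int>(dr, "Age");
        Console.WriteLine(name + " " + age);
    }
}
```

A rename then only touches the mapping, but a removed column still surfaces as a run-time error, which is the limitation described above.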
One suggestion, though: instead of accessing the column directly through a literal, use constants with a good naming standard. So
string name = (string)dr["firstname"];
int age = (int)dr["age"];
Becomes
private const string CUSTOMER_COLUMN_FIRST_NAME = "firstname";
private const string CUSTOMER_COLUMN_AGE = "age";
string name = (string)dr[CUSTOMER_COLUMN_FIRST_NAME];
int age = (int)dr[CUSTOMER_COLUMN_AGE];
This doesn't solve your problem, but it lets you add better meaning to the code (even if you decide to abbreviate the constants' names) and makes changing a name easier, since it's centralized. And, if you want, Visual Studio can generate a class (inherited from DataTable) that statically defines your database rows, which also makes the code's semantics clearer.
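To give a feel for what a statically defined row buys you, here is a hand-written sketch of a typed wrapper. CustomerRow and CustomerRowDemo are hypothetical names, and this is not the code Visual Studio generates. Field<T> is the DataRow extension method in the System.Data namespace; on .NET Framework it needs a reference to System.Data.DataSetExtensions.

```csharp
using System;
using System.Data;

// Hypothetical typed wrapper: column names and casts live in one place.
public sealed class CustomerRow
{
    private const string FirstNameColumn = "firstname";
    private const string AgeColumn = "age";

    private readonly DataRow _row;

    public CustomerRow(DataRow row)
    {
        _row = row;
    }

    // Field<T> returns null for DBNull when T is a reference or nullable type.
    public string FirstName => _row.Field<string>(FirstNameColumn);
    public int? Age => _row.Field<int?>(AgeColumn);
}

public static class CustomerRowDemo
{
    // In-memory stand-in for the stored procedure result.
    public static DataTable BuildSampleTable()
    {
        var table = new DataTable();
        table.Columns.Add("firstname", typeof(string));
        table.Columns.Add("age", typeof(int));
        table.Rows.Add("Ada", 36);
        table.Rows.Add("Bob", DBNull.Value);
        return table;
    }

    public static void Main()
    {
        foreach (DataRow dr in BuildSampleTable().Rows)
        {
            var customer = new CustomerRow(dr);
            string age = customer.Age.HasValue ? customer.Age.Value.ToString() : "unknown";
            Console.WriteLine(customer.FirstName + ": " + age);
        }
    }
}
```

The rest of the application then uses customer.FirstName and never sees a column name, so a rename is a one-line change in the wrapper.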
Apparently you have to introduce another layer of abstraction between your database and your application. Yes, this layer can be Linq2Sql, Entity Framework, NHibernate or any other ORM (object-relational mapping) framework.
Now about that 'automatically'... maybe this kind of small change (renaming a column) can be handled automatically by some tool or framework. But I don't think that any framework can guarantee proper handling of changes automatically. In many cases you will have to manually do the "mapping" between your database and that new layer, so that you can keep the rest of your application unaffected.
Yes. Use stored procedures for all access, and alias the actual column names in the table for output to the client code. Then, if the actual column names in the table change, you just have to change the SQL in the stored proc and leave the aliases as they were, and the client code can stay the same.
Is there a way to not break code if columns in a database change?
No
It is a very, very good thing that you can't do this (completely) automatically.
If the database changes such that an application feature is no longer valid, you don't want the application to continue to expose the feature if the database no longer supports the feature.
In the best case, you want database changes to cause your code to no longer compile, so you can catch the problems at compile time rather than run time. Linq will help you catch these kinds of issues at compile time and there are many other ways to increase the agility of your code base such that database changes can be somewhat quickly propagated through the entire code base. Yes, ORMs can help with this. While views and stored procedures may make the problem better, they may also make it worse by increasing the complexity and amount of code that needs to react to changes to columns in tables.
Using code generation of some sort to generate (at least some part of) your data layer is your best bet for getting compile-time errors when your application and database get out of sync. You should probably also have unit tests around your data layer to detect as many run-time inconsistencies as possible when it's difficult to find the errors at compile time (for example, things like size constraints on columns).
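A data-layer test along those lines might look like the sketch below. It assumes the test can load a DataTable from the stored procedure. SchemaCheck and FindMismatches are hypothetical names, and the in-memory table stands in for a real query result.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class SchemaCheck
{
    // Compares the columns the code expects against what the table actually has
    // and returns one message per missing or mistyped column.
    public static List<string> FindMismatches(
        DataTable table, IDictionary<string, Type> expected)
    {
        var problems = new List<string>();
        foreach (KeyValuePair<string, Type> pair in expected)
        {
            if (!table.Columns.Contains(pair.Key))
            {
                problems.Add("missing column: " + pair.Key);
            }
            else if (table.Columns[pair.Key].DataType != pair.Value)
            {
                problems.Add("wrong type for " + pair.Key + ": "
                    + table.Columns[pair.Key].DataType.Name);
            }
        }
        return problems;
    }

    public static void Main()
    {
        // Simulates the schema after firstname was renamed and age was removed.
        var table = new DataTable();
        table.Columns.Add("first_name", typeof(string));

        var expected = new Dictionary<string, Type>
        {
            { "firstname", typeof(string) },
            { "age", typeof(int) },
        };

        foreach (string problem in FindMismatches(table, expected))
        {
            Console.WriteLine(problem);
        }
    }
}
```

Run against a test database as part of the build, a check like this reports the rename and the removal before the application hits them in production.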
It won't help when "age" is removed, but if you know that the columns will always be returned in the same order, even if the names change, then you could reference them by column index instead, like:
string name = (string)dr[0];
int age = (int)dr[1];
Depending on your DB version, you could also check out a Data Access generator such as SubSonic.