In what format is SQL Server data serialized when it is sent through the network?
The reason I am asking is that we are planning to read A LOT (several GBs) of data from a SQL Server database into a .Net app for processing. I would like to know how much overhead to allow for each record when estimating the impact on our network traffic.
E.g. a record consists of 5 integers (which makes 4 * 5 = 20 bytes of data). How many bytes are physically transferred per record? Is there a precise formula or a rule of thumb?
SQL Server uses the TDS (Tabular Data Stream) protocol; the wire format is documented in the MS-TDS specification on MSDN.
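For a rough back-of-envelope estimate: per the TDS spec, each result row is prefixed by a one-byte ROW token, nullable fixed-size types such as INT travel as INTN with a one-byte length prefix, and every TDS packet carries an 8-byte header (default packet size 4096 bytes). Treat those constants as assumptions from my reading of the spec, not a guarantee; a minimal sketch:

```python
# Back-of-envelope TDS wire-size estimate for a row of 4-byte integers.
# Constants are assumptions taken from the MS-TDS spec, not measured values.

def estimate_row_bytes(int_columns: int, nullable: bool = True,
                       packet_size: int = 4096) -> float:
    """Approximate wire bytes for one row of INT columns over TDS."""
    token = 1                              # ROW token byte prefixes each row
    per_col = 4 + (1 if nullable else 0)   # INTN adds a 1-byte length prefix
    row = token + int_columns * per_col
    # Each TDS packet has an 8-byte header; amortize it over the payload.
    header_overhead = row * 8 / (packet_size - 8)
    return row + header_overhead

print(estimate_row_bytes(5))         # ~26 bytes/row for 5 nullable INTs
print(estimate_row_bytes(5, False))  # ~21 bytes/row if the columns are NOT NULL
```

So for the 20 data bytes in the question, expect on the order of 21-26 bytes on the wire per row, plus the one-time COLMETADATA block per result set and TCP/IP framing underneath. Measuring with Wireshark, as suggested below, will give you the real number.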
Frankly, I wouldn't worry about it. GBs of data will take time no matter how it's done unfortunately.
I don't have a clue about the actual format, but I would suggest an empirical approach and hook up Wireshark and measure the data yourself.
As Peter M said, test it.
There isn't a reliable calculation you can perform that will give you enough information to work from.
The reality is that there are too many variables to consider. For example:
What actual rate do the NICs involved transfer at? Note that this rate will differ based on which network cards are in place as well as which DRIVERS those cards are using. You could quite easily have a 1Gb card that can only transfer at, say, 300Mb due to driver issues. I've even seen two cards by the exact same manufacturer with the same drivers have different transfer speeds due to a slight configuration difference in one of the cards.
What other pieces of equipment are between the two machines in question? Again, depending on the hardware, OSes, etc., you may see wildly different numbers. A $100 8-port 1Gb unmanaged switch from TRENDnet is going to have completely different throughput than a $5000 1Gb managed Cisco switch.
You will also have to consider the existing network "weather" at the time of the transfer: what is the throughput of OTHER traffic sharing the same lines? This is a transient factor that shifts as different demands are placed on the network.
Additionally, some NICs support TCP offloading and others don't. If yours don't, the effective transfer rate is going to be hampered by whatever else the CPUs on those boxes are doing.
Next, hard drives have to be taken into consideration. Since this is a large amount of data, the read and write speeds of the various drives will have an impact. Sure, the network might be able to run at 90% efficiency, but with this much data the hard drives themselves might not be able to keep up, dropping effective throughput to 25% or less.
The point is, you have to test it, and at the end of the day the protocol SQL Server uses will be immaterial to your findings. And don't run just one test; run a lot of real-world tests. Only then will you be able to come up with an average, which might still be off depending on whatever else is going on at the time, but you should be able to get within, say, 10%.
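The "just measure it" advice can be wrapped in a tiny harness: time how long it takes to drain a stream in chunks and report MB/s. The in-memory BytesIO below is only a stand-in; in a real test you would drain your actual data reader (e.g. a SqlDataReader loop on the .Net side) instead.

```python
# Minimal throughput harness: drain a stream and report bytes, seconds, MB/s.
# The BytesIO source is a stand-in for a real network/database stream.
import io
import time

def measure_throughput(stream, chunk_size: int = 64 * 1024):
    """Drain `stream` in chunks; return (total_bytes, seconds, MB_per_s)."""
    total = 0
    start = time.perf_counter()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed, total / elapsed / 1_000_000

# Stand-in source: 16 MB of zeroes instead of a live connection.
total, secs, rate = measure_throughput(io.BytesIO(bytes(16 * 1024 * 1024)))
print(f"{total} bytes in {secs:.3f}s ({rate:.1f} MB/s)")
```

Run it repeatedly, at different times of day, against the real source; the spread between runs is exactly the "network weather" factor described above.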
From my observations, standard SQL commands cause a lot of round-trips. So for transferring lots of data it helps if you can restate the problem as moving one table in bulk. Then you can use a bulk copy operation, which is much more efficient. See: Bulk Copy Operations in SQL Server (ADO.NET) and the bcp Utility.
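As a sketch of the bcp route: the utility can export a whole table in native format with a single command. The server, database, and file names below are made up for illustration; -S, -T, and -n are bcp's standard switches for server, trusted authentication, and native format. Assembled in Python so the command can be inspected before running:

```python
# Hypothetical bcp export command, built as an argument list for inspection.
# Table/server/file names are illustrative placeholders, not real objects.
import subprocess

def build_bcp_export(table: str, out_file: str, server: str) -> list:
    return [
        "bcp", table, "out", out_file,
        "-S", server,   # target server
        "-T",           # Windows trusted authentication
        "-n",           # native data format (compact, no text conversion)
    ]

cmd = build_bcp_export("MyDb.dbo.Measurements", "measurements.dat", "dbserver01")
print(" ".join(cmd))
# To actually run it against a live server: subprocess.run(cmd, check=True)
```

Native format (-n) avoids the cost of converting every value to and from text, which matters at the multi-GB scale the question describes; SqlBulkCopy in ADO.NET is the equivalent for loading data back in from the .Net side.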
Actually, TDS is an extremely slow protocol. SQL Server is optimized for processing data, not for marshaling tons of it back and forth. While the representation overhead is not large, the fact that it is a request-response protocol, plus its lack of boxcarring (batching multiple requests per round-trip), makes it quite slow compared to a dedicated high-throughput protocol, even compared to protocols inside SQL Server itself (like the Database Mirroring or Service Broker protocols). But even so, with TDS being as slow as it is, a SQL Server shooting at full speed through a TDS pipe will overwhelm your .Net client, guaranteed.
Overall, if you ever come to ask a question like the one you asked, it means you're doing it wrong.