12-27-2010 10:04 AM - edited 03-06-2019 02:43 PM
I have been reading a lot of information about checksums and CRCs.
I am a little surprised now, because we always read that TCP is reliable, yet this is not entirely true.
Do you know of any resources where this problem, concerning TCP in particular, is discussed?
Thanks.
12-27-2010 10:10 AM
You can start here:
12-27-2010 10:24 AM
Thanks for your answer.
I do not think those texts address the kind of question I am facing.
I know there are academic articles about it, but I would like to know if someone can point me to resources where this problem is treated, in particular for TCP, without my having to dig through detailed papers.
12-27-2010 10:46 AM
Hi,
The reliability of TCP has to be interpreted properly, and you have just stumbled across the hard truth. TCP is able to provide in-order delivery and retransmission of lost segments if and only if a loss or transmission error is detected. The sequence numbering of segments provides an easy and truly reliable way of telling whether all segments are arriving, and in order.
The detection of errors induced during transmission is more complicated. The checksum used with TCP is not particularly sensitive to multiple errors. It can indeed happen that a TCP segment gets damaged in such a way that the checksum computation will not detect the damage. Error-detecting and error-correcting codes are actually a large topic within information and coding theory, nowadays a prominent branch of mathematics.
However, with the particular checksum used by TCP, you can think in these terms: the checksum is a single 16-bit value, regardless of the segment size (which can itself approach 65,535 bytes, the limit imposed by the 16-bit IP Total Length field). If you consider the TCP payload to be a simple binary string, then computing the checksum maps, or transforms, this string into a checksum of a fixed size. In other words, the checksum is a function that maps all possible payloads of all possible lengths onto the set of 16-bit strings. Logically, the set of input strings is vastly larger than the set of possible checksums, yet each input string is assigned a checksum. By the pigeonhole principle, two or more input strings necessarily and unavoidably share the same output string - the same checksum! So checksum collisions - and thus undetectable errors - are a natural and unavoidable result of the particular way TCP segment checksumming is performed.
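To make the collision argument concrete, here is a minimal sketch (in Python, purely as illustration; the payload bytes are made up) of the RFC 1071 Internet checksum used by TCP. Because the checksum is a ones'-complement sum of 16-bit words, it is completely insensitive to the order of those words: swapping two words yields a different payload with an identical checksum - exactly the kind of error TCP cannot detect.

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 Internet checksum: ones'-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

a = b"\x12\x34\xab\xcd"
b = b"\xab\xcd\x12\x34"  # same 16-bit words, reordered: a real corruption
print(hex(internet_checksum(a)))  # same value for both payloads
print(hex(internet_checksum(b)))
```

Since addition is commutative, reordering the 16-bit words never changes the sum, so this whole class of multi-bit corruption passes the check.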
Practice shows, however, that TCP failures to provide a reliable data transport service are actually quite rare, despite the relatively simple implementation of the checksum. The checksum is very good at detecting single-bit errors, and multiple-bit errors are more likely to be caught at Layer 2, which uses stronger methods (CRCs) of verifying transmitted data.
The bottom line, though: TCP is not absolutely reliable.
Best regards,
Peter
12-27-2010 11:25 AM
Hello Peter and thanks for your answer.
With the "Internet checksum" used in IP packets and TCP segments, we have a 1/65,536 probability that an invalid packet or segment will be considered valid.
With the "CRC-32" used in Ethernet frames, under certain assumptions, we could have a 1/100,000,000,000,000,000,000 probability that an invalid frame will be considered valid.
That is a small number, but in my opinion not small enough.
And I am not sure we can accept even this small risk with financial transactions.
Do you know whether stronger methods are used at the application layer in these important cases, even if stronger methods are slower when implemented in software?
12-27-2010 03:15 PM
Hi,
I must stress at the outset that I have only a passing knowledge of coding theory, and there are probably far more experienced people around who know this better than I do, so please take my answers with a grain of salt.
With "Internet checksum" used in IP packet and TCP segment, we have 1 / 65,536 probability that an invalid packet or an invalid segment will be considered valid.
Hmmm, how did you arrive at this number? It does not seem correct to me, at least not if we allow multiple bit errors to occur in a single payload.
Do you know if in these important cases stronger methods are used at application layer?
This depends very strongly on the application in use. For example, some archiving programs (bzip2, for one) create files that are internally divided into blocks, each of which is equipped with its own CRC-32. IPsec and TLS use Hash-based Message Authentication Codes, or HMACs (based on MD5, SHA, and possibly others), to verify integrity. Other proprietary protocols - transactional databases, for example - may have their own methods of verifying transmitted data. It really depends on the application itself.
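As an illustration of that kind of application-layer protection, here is a small sketch using Python's standard hmac module (the key and message are made up for the example). The receiver recomputes the tag with the shared key, so even a single flipped bit - whether from an undetected transmission error or from tampering - makes verification fail.

```python
import hashlib
import hmac

key = b"shared-secret"            # hypothetical pre-shared key
message = b"TRANSFER 100.00 EUR"  # hypothetical sensitive payload

# Sender attaches the tag to the message
tag = hmac.new(key, message, hashlib.sha256).digest()

# Receiver recomputes the tag and compares in constant time
assert hmac.compare_digest(tag, hmac.new(key, message, hashlib.sha256).digest())

# A single flipped bit in transit makes verification fail
tampered = bytes([message[0] ^ 0x01]) + message[1:]
assert not hmac.compare_digest(tag, hmac.new(key, tampered, hashlib.sha256).digest())
```

Unlike a plain checksum or CRC, an HMAC also resists deliberate modification, since forging a valid tag requires the key.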
Best regards,
Peter
12-27-2010 11:58 PM
Hello Peter.
In the case of the "Internet checksum" I had assumed random corruption, i.e., that every number of errors in a message is equally likely. In practice this is not true, and you are right.
In the case of CRC-32 I found that, with a message length of 12,112 bits, equal to the length of an Ethernet frame, all 1-, 2-, and 3-bit errors are always detected. So the dominant term comes from 4-bit errors, of which 223,059 out of 906x10^12 possible patterns go undetected.
You can download Koopman's 2002 article for more information. In it, the author assumes an error rate of 1 bit in 10^6.
With this information I computed the probability given in my previous post.
I do not know whether a 10^-6 bit error rate still holds for today's Ethernet media, but I assume the author chose a realistic example.
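A quick sanity check of part of that claim is easy to run (a Python sketch using the standard zlib.crc32; the frame contents are random filler): for a full-size Ethernet payload of 12,112 bits, every single-bit flip is guaranteed to change the CRC-32, because the generator polynomial has more than one term. The 2- and 3-bit guarantees from Koopman's analysis would need an exhaustive search, so only single-bit flips are sampled here.

```python
import random
import zlib

random.seed(0)  # deterministic filler data for repeatability
frame = bytes(random.getrandbits(8) for _ in range(1514))  # 12,112 bits
crc = zlib.crc32(frame)

# Flip one bit at 100 random positions: CRC-32 must detect each change
for _ in range(100):
    pos = random.randrange(len(frame) * 8)
    corrupted = bytearray(frame)
    corrupted[pos // 8] ^= 1 << (pos % 8)
    assert zlib.crc32(bytes(corrupted)) != crc

print("all sampled single-bit errors detected")
```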
12-28-2010 03:16 AM
These are mostly idle, academic discussions.
Every layer 2 protocol also has an error detection mechanism, if not forward error correction or retransmission for some media types.
Combined with the layer 3 checksum, this makes IP/TCP 100% reliable for any and all practical purposes, including financial, medical, military, scientific, and high-performance computing. In other words, it just works perfectly fine.
12-28-2010 03:31 AM
Hello Paolo,
I agree with most of what you wrote. Granted, this is an academic discussion. Then again, even academic discussions are useful for those interested in joining them (not to mention that ongoing evolution and development stand and fall on academic discussions) - you surely agree.
Stating that the combined protection of several layers makes IP/TCP 100% reliable is incorrect from an exact mathematical standpoint. However, we can say with complete confidence that for practical purposes the protection is reliable, and that is what we are after, in the end.
Best regards,
Peter
12-28-2010 04:14 AM
Ok, it is my problem.
But once and for all: is it true or false that with those sensitive applications the system needs error protection also at the application layer?
From Paolo's answer it seems that the system only ever needs the layer 2, 3, and 4 error protection.
How many Ethernet frames, for example, are sent every day in the world?
I do not really know.
Maybe Paolo is saying that the system works as surely as the second law of thermodynamics holds.
Only in that case would I agree.
Sorry for my insistence.
12-28-2010 04:32 AM
Hello,
Hey, no need to apologize - we are simply discussing things from both their theoretical and practical perspective. There's nothing wrong with that.
But once and for all: is it true or false that with those sensitive applications the system needs error protection also at the application layer?
This question cannot be answered in general for there are simply too many variables involved. This is the classical case where the theory tells us that there is a minute possibility, almost infinitesimal but not equal to zero, that a transmission error can get undetected. However, such an event is so improbable that for any realistic practical implementation, it can be regarded as an impossible event (the probability theory calls it a practically impossible event).
In physics, there is a nice example of two interconnected closed containers filled with a gas. We know that gases behave according to probabilistic properties. Now, there is a possibility that all molecules of that gas would migrate to one of those containers, leaving the other one in the state of absolute vacuum. But that possibility is infinitesimally small, so such an event is practically impossible. The situation with transmission error eluding the layered error detection is similar.
Applications keenly sensitive to the integrity of transported data may consider using an additional layer of protection. After all, just have a look at how CD/DVD images are distributed (e.g., Knoppix images): along with the image, you can also download the MD5 or SHA hash to verify the correctness of the image after download - this is also a way of additionally checking integrity strictly in userspace, at the application layer. There are plenty of available mechanisms to do just that. The fact is, though, that simple real-life experience shows us that undetected transmission errors are practically nonexistent, so implementing an additional layer of checking may simply not be worth the effort. This is simply a fact we all have to live with.
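By way of illustration, here is roughly what that verification amounts to (a Python sketch; a throwaway temporary file and its hash stand in for the real image and the published hash):

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str) -> str:
    """Stream the file in chunks so even multi-GB images fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()

# Simulate a downloaded image and its separately published hash
payload = b"pretend this is an ISO image"
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(payload)
    path = f.name

published = hashlib.sha256(payload).hexdigest()  # as listed on the mirror
assert sha256_of_file(path) == published         # download verified intact
os.unlink(path)
```

Any corruption anywhere in the file, on the wire or on disk, changes the hash and the comparison fails.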
Best regards,
Peter
12-28-2010 07:27 AM
Ok Peter and thanks for your patience.
I am going to post a "more practical" discussion about VTP and I would like to have your opinion if possible.
Thanks.
12-28-2010 07:39 AM
speculor_cisco wrote:
Ok, it is my problem.
But once and for all: is it true or false that with those sensitive applications the system needs error protection also at the application layer?
Yes, they may. But not so much to protect against transmission errors; more realistically, to protect against software bugs, memory corruption, human tampering, and any kind of unforeseen event that could be harmful.
Thanks for the nice rating and good luck!