
Datagram VS Packet

169.254.X.Y
Level 1

I am just assuming that the difference between packets and datagrams is the IP protocol type. If the IP protocol is TCP, it is a packet; on the other hand, if the IP protocol is UDP, it is a datagram. If there is more to the difference between packets and datagrams, please let me know :)

1 Accepted Solution


Peter Paluch
Cisco Employee

Hi,

My understanding is that if data received from the application layer is bigger than the MSS (probably 1460 bytes), the data gets segmented.

If we are talking about TCP, then this is absolutely correct. However, keep in mind that TCP is allowed to make a smaller segment anytime it wishes to do so - not all TCP segments sent by a particular host in a given connection have to carry 1460 bytes in their payload. There are multiple reasons for that - for example, the congestion window on the sender side may be smaller than 1460 bytes, or there may be fewer than 1460 bytes ready to be sent, and TCP does not want to wait too long for additional bytes to be ready, as that would introduce overly large delays.
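
Purely as an illustration of the segmentation idea above (a toy sketch, not how a real TCP stack works - the function name and the 1460-byte MSS are just assumptions for the example), splitting an application byte stream at the MSS boundary looks like this in Python:

# Toy sketch of MSS-based segmentation; real TCP stacks do this in the kernel
# and may send smaller segments (small congestion window, little data ready, ...).

MSS = 1460  # assumed maximum segment size in bytes

def segment_payload(data: bytes, mss: int = MSS) -> list[bytes]:
    """Split an application byte stream into chunks of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]

app_data = bytes(9000)                  # 9000 bytes handed down by the application
print([len(s) for s in segment_payload(app_data)])
# [1460, 1460, 1460, 1460, 1460, 1460, 240]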

If the segment received from L4 plus the IP header is bigger than the MTU (1500 bytes), it gets fragmented.

That is also correct. Technically, what gets fragmented is the entire resulting IP packet along with the TCP segment inside.
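
As a rough sketch of the fragmentation arithmetic only (assuming a 20-byte IPv4 header with no options; every fragment except the last must carry a payload that is a multiple of 8 bytes because the fragment offset field counts 8-byte units):

# Toy calculation of IPv4 fragment payload sizes for a given MTU.

MTU = 1500
IPV4_HEADER = 20                                   # assumed: no IP options

def fragment_payload_sizes(ip_payload_len: int, mtu: int = MTU) -> list[int]:
    """Payload bytes carried by each fragment of one oversized IP packet."""
    max_per_fragment = (mtu - IPV4_HEADER) // 8 * 8   # must be a multiple of 8
    sizes = []
    remaining = ip_payload_len
    while remaining > 0:
        take = min(max_per_fragment, remaining)
        sizes.append(take)
        remaining -= take
    return sizes

# A 2000-byte TCP segment (header + data) inside one IP packet, sent over a 1500-byte MTU link:
print(fragment_payload_sizes(2000))   # [1480, 520]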

So, segments are made in L4 and packets are made in L3.

Absolutely correct.

When I studied for ICND1, I was taught that packet and datagram are interchangeable terms.

The terminology is somewhat loose in this respect, and different authors may suggest different terms. It has been fairly universally accepted that a frame describes an L2 message, a packet describes an L3 message, and a segment describes an L4 message.

As for datagram, this term has a much wider and less strict meaning. I personally consider the term "datagram" to stand for "message of a certain type". In other words, frame is an L2 datagram, packet is an L3 datagram, segment is an L4 datagram. The "datagram" itself does not have the connotation of any particular layer; instead, it just describes a well formed message originated by one of the layers. Even UDP - User Datagram Protocol - uses the word "datagram" in its name without being exclusive about it. You probably remember the Protocol Data Unit, or PDU. Well, in my personal understanding, the datagram is the same as a PDU.

If A needs to send 9000 bytes of data to B, and A learned through the 3-way handshake that B has a window size of 8100 bytes, A will segment the 9000 bytes into six 1460-byte segments and one 240-byte segment. Since B has a window size of 8100 bytes, A will send 5 segments and wait for an ACK from B. I know that the first 5 packets will have different sequence numbers. Will the 5 packets' sequence numbers be B's last ACK +1, +2, +3, +4, +5?

This is a more complicated topic.

Do not confuse window size and maximum segment size (MSS). The window size indicated in each TCP segment, also called the receiver window (rwnd), tells the other TCP peer how many bytes it can send without waiting for any acknowledgment. It does not matter how those bytes are segmented - whether it is one large segment or many smaller ones. The receiver window size simply tells how many bytes can be "in flight" because the receiver has enough memory to store them when they come. This indicated window size keeps changing all the time - if you captured a TCP session when, say, downloading an ISO image, and checked your TCP ACK segments and the window size they indicate, the value there would fluctuate and keep changing. That is because it always reflects the momentary amount of free space that the operating system has available for this particular TCP session to receive in-flight data.

In addition, if A and B knew that their IP MTU is 1500, they would never indicate an MSS that would cause the resulting IP packets to be larger than 1500 bytes. The MSS would be MTU-40 for IPv4, and MTU-60 for IPv6 (as IPv6 header is 40 bytes long); operating systems may decide to indicate even slightly smaller MSS for internal reasons.
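
A quick back-of-the-envelope check of those numbers, assuming 20-byte TCP and IPv4 headers and a 40-byte IPv6 header (all without options):

MTU = 1500
TCP_HEADER = 20      # without options
IPV4_HEADER = 20     # without options
IPV6_HEADER = 40

print("IPv4 MSS:", MTU - IPV4_HEADER - TCP_HEADER)   # 1460
print("IPv6 MSS:", MTU - IPV6_HEADER - TCP_HEADER)   # 1440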

However, assuming that the MTU on A and B is truly 1500 bytes, and that A has 9000 bytes to send to B, then both A and B would indicate an MSS of 1460 bytes to each other, and A could accomplish the transmission with 6 segments each carrying 1460 bytes of this data and a 7th segment carrying the remaining 240 bytes. Note that the receiver window size did not play a role in creating these segments - it will play a role when scheduling their transmission, so that A never sends out more bytes in one "batch" (a burst of segments) than B can process.

What I've just described is a naive TCP implementation. Current TCP implementations, however, do this in a significantly more complex way. For example, they may adapt the size of the last segment so that they exactly fill the complete receiver window if they do not receive any acknowledgment from the peer. More importantly, though, every TCP sender has its own sender window, also called the congestion window (cwnd). The size of this window is computed locally and is not indicated between the TCP peers. It can shrink and grow based on time, the number of required retransmissions to the neighbor, and other parameters, and it also impacts the amount of in-flight data; the sender will always send at most min(cwnd, rwnd) bytes segmented in any way it pleases before stopping and waiting for an acknowledgment. Different TCP implementations differ in how they manage the cwnd size, and there are truly many to choose from:

https://en.wikipedia.org/wiki/TCP_congestion_control
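
Purely as a toy illustration of the min(cwnd, rwnd) rule mentioned above (not a model of any real congestion control algorithm; the byte counts are arbitrary), the in-flight limit could be expressed as:

# How many more unacknowledged bytes the sender may put on the wire right now.
# Real stacks (Reno, CUBIC, BBR, ...) manage cwnd in far more involved ways.

def sendable_bytes(cwnd: int, rwnd: int, in_flight: int) -> int:
    return max(0, min(cwnd, rwnd) - in_flight)

print(sendable_bytes(cwnd=4380, rwnd=8100, in_flight=2920))   # 1460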

Assuming the naive TCP implementation described above, however, the ACK numbers from B and consequently the sequence numbers from A would be as follows (the short sketch after this list reproduces these numbers):

  • 1 (in TCP SYN/ACK)
  • 1461 (after receiving 1st 1460 bytes)
  • 2921 (after receiving 2nd 1460 bytes)
  • 4381 (after receiving 3rd 1460 bytes)
  • 5841 (after receiving 4th 1460 bytes)
  • 7301 (after receiving 5th 1460 bytes)
  • 8761 (after receiving 6th 1460 bytes)
  • 9001 (after receiving the remaining 240 bytes)
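
A few lines of Python reproduce these relative numbers for the naive case (sequence/ACK numbers shown relative to the initial sequence number, the way Wireshark displays them):

segments = [1460] * 6 + [240]             # 9000 bytes split per a 1460-byte MSS

ack = 1                                   # ACK 1 acknowledges the SYN
print(ack, "(in TCP SYN/ACK)")
for n, size in enumerate(segments, start=1):
    ack += size                           # cumulative ACK: next byte B expects
    print(ack, f"(after segment {n}, {size} bytes)")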

There is one more thing that might be confusing at first. While the receiver window allows the sender to send a certain amount of data without waiting for an acknowledgment from the receiver, it does not mean that the receiver will only send an acknowledgment after receiving the whole window of data. Quite the contrary is true - the receiver is free to send acknowledgments as new data keeps arriving from the sender. This allows the data transmission to be smooth - not send-and-wait-for-ack, but rather keep-sending-as-acks-are-arriving.

Check out this demo:

http://www2.rad.com/networks/2004/sliding_window/

It has its share of simplifications and inaccuracies, but it does explain a lot. Play with small and large windows to see how they impact the waiting time until new data can be sent.
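
In the same simplified spirit as that demo (ignoring congestion control, loss, and serialization delay entirely), you can estimate the throughput ceiling that the window imposes: at most one window of bytes per round-trip time. The 50 ms RTT below is just an assumed value:

# Rough upper bound: throughput <= window_size / round_trip_time.

def max_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    return window_bytes * 8 / rtt_seconds / 1e6

RTT = 0.050                                   # assumed 50 ms round trip
for window in (1460, 8100, 65535):
    print(f"{window} bytes -> {max_throughput_mbps(window, RTT):.2f} Mbit/s")
# 1460 bytes  -> 0.23 Mbit/s
# 8100 bytes  -> 1.30 Mbit/s
# 65535 bytes -> 10.49 Mbit/s   (no matter how fast the link is)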

OSPF calculates the cost using bandwidth and chooses the path that provides the best bandwidth. If the receiver's window size is extremely small, I think the data transfer speed would be almost the same regardless of the bandwidth.

It is better not to draw any relation between OSPF path selection criteria and the throughput of a TCP session, because these two things are really unrelated. Of course, OSPF tries to pick the fastest path (if we dug deep into the meaning of the OSPF metric, OSPF picks the path with the smallest total latency, following the idea "the faster the link, the smaller the serialization delay, thus the smaller the latency"), but the OSPF metric has no direct meaning in relation to the true throughput of the whole path. The receiver window does have an impact on the overall speed of data transmission - with a very small receiver window, the sender will send a small chunk of data and then wait until an ACK comes back, then send another small chunk of data and wait for the next ACK, and that waiting will slow down the transmission - try setting the window size to 1 in the demo above and see for yourself. However, the window size is not directly related to the path throughput as it exists in the network.
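
For reference, the classic OSPF interface cost derivation mentioned in the question looks roughly like this on Cisco routers (default reference bandwidth of 100 Mbps, minimum cost of 1); note that it says nothing about the TCP throughput you will actually get over the path:

# OSPF cost = reference_bandwidth / interface_bandwidth (floored, minimum 1).

REFERENCE_BW_MBPS = 100                      # Cisco default; often raised on modern networks

def ospf_cost(interface_bw_mbps: float) -> int:
    return max(1, int(REFERENCE_BW_MBPS // interface_bw_mbps))

for bw in (10, 100, 1000, 10000):            # Ethernet, FastEthernet, GigE, 10GigE in Mbps
    print(f"{bw} Mbps -> cost {ospf_cost(bw)}")
# 10 -> 10, 100 -> 1, 1000 -> 1, 10000 -> 1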

Feel welcome to ask further!

Best regards,
Peter

 


8 Replies

WOW. Again, thank you very much. Your answers are truly and literally priceless!

Joseph W. Doherty
Hall of Fame

Way back when, I read datagrams as being a generic term for independent (to the network) chunks of data to be delivered. This is in contrast to a network that imposes some kind of end-to-end "circuit". IP packets could be considered to be datagrams (as might Ethernet frames and/or UDP packets).

TCP segments, though, might not be considered datagrams, because TCP builds a session between the two endpoints and guarantees delivery (including sequencing).

mmeridaa1
Level 1
Hello. I just want to add some old but reliable information found at IETF.org.

According to RFC 791, a datagram is an IP block of data, and a packet is the result of datagram fragmentation.

https://tools.ietf.org/html/rfc791#section-1.1:

Quotation:

..."The internet protocol provides for transmitting blocks of data called datagrams
from sources to destinations"...
..."The internet protocol also provides for fragmentation and reassembly of long datagrams,
if necessary, for transmission through "small packet" networks"...
 

If you're saying that you believe, per RFC 791, a packet is a fragmented datagram, I believe you might misunderstand what the RFC is saying.

Datagrams are what the RFC says IP uses for transmitting blocks of data across packet-switched networks. Further, IP supports fragmentation and reassembly of long datagrams for transmission through "small packet" networks.

The way I would read the above, datagram is being used as a term specific to IP. Further, packet-switched networks can carry IP datagrams along with other packets; i.e., an IP datagram is treated as a packet on such a network.

Hi Joseph. I agree with you almost totally.

 

My English is very bad, and I think that using the word "resultant" was not a good idea.

 

I read the RFC as saying that a datagram could be fragmented (if necessary) and each fragment is transmitted inside a packet.

 

Datagram is specific to IP (I totally agree).

 

I cannot understand something you wrote: "...i.e., an IP datagram is treated as a packet on such a network."


I will try to explain what I understand from RFC 791: a datagram is a block of information, and its size could be greater than, smaller than, or equal to the maximum packet size on a given network. Only when the datagram is smaller than or equal to the maximum packet size can "a datagram be treated like a packet". Otherwise (greater than), the datagram is fragmented, and each fragment is transmitted as a packet on such networks.

Could you comment on what you said: "...an IP datagram is treated as a packet on such a network"?

Thanks.

"Could you make a comment about you say "...an IP datagram is treated as a packet on such network""

I read that as: a packet-switched network is so named to make it clear that it's different from a circuit-switched network. Second, a packet-switched network might not be limited to only IP.

"Only in cases of smaller or equal size between datagram and packets "a datagram could be treated like a packet". In other case (grater than) a datagram is fragmented and any fragment is transmitted like a packet on such networks."

Okay, to say it another way: for forwarding purposes, a packet-switched network doesn't "care" whether it's dealing with datagrams that have been fragmented or not. Fragmented and non-fragmented packets are forwarded alike.

Where a packet-switched network does care: if a datagram cannot fit into the maximum size packet the medium can handle, the datagram/packet may be fragmented. Again, if fragmented, the fragment packets are forwarded generally the same as any other non-fragmented datagram packet. However, the receiver of the fragmented packets will reconstruct the original datagram.

Thanks a lot, Joseph. Well explained using the "packet-switched network" vs. "circuit-switched network" reference. Really clear after reading your answer.
