Understanding Linux Network Internals: Transmitting IPv4 Packets
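Packet transmission is the process by which a packet leaves the local host on its way to another system.
A transmission can be initiated by an L4 protocol or by packet forwarding.
As shown in the article "Understanding Linux Network Internals: Receiving IPv4 Packets (Forwarding and Local Delivery)", forwarding ends with a call to dst_output, which interacts with the neighboring subsystem and then hands the packet to the device driver. A transmission initiated by an L4 protocol ends up going through the same step (a call to dst_output). This article covers the stages that an L4-initiated transmission passes through in IPv4 processing (the IP layer).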
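Both paths converge on dst_output because the function does little work itself: it invokes the output hook stored in the routing entry attached to the packet. The following is a minimal user-space sketch of that indirection, not kernel code. The names struct pkt, struct route_entry, pkt_dst_output, unicast_output and multicast_output are hypothetical stand-ins for the kernel's sk_buff, dst_entry, dst_output and the per-route output functions.

```c
#include <stddef.h>

struct pkt;                          /* hypothetical stand-in for struct sk_buff */

struct route_entry {                 /* hypothetical stand-in for struct dst_entry */
    int (*output)(struct pkt *p);    /* hook chosen when the route was resolved */
};

struct pkt {
    struct route_entry *dst;         /* routing decision attached to the packet */
    int handled_by;                  /* records which hook ran (illustration only) */
};

enum { VIA_UNICAST = 1, VIA_MULTICAST = 2 };

/* Two example output hooks; a real hook would continue down the stack. */
static int unicast_output(struct pkt *p)
{
    p->handled_by = VIA_UNICAST;
    return 0;
}

static int multicast_output(struct pkt *p)
{
    p->handled_by = VIA_MULTICAST;
    return 0;
}

/* Mirrors the shape of dst_output: dispatch through the route's hook. */
static int pkt_dst_output(struct pkt *p)
{
    return p->dst->output(p);
}
```

Whether the packet was forwarded or generated locally, the caller only needs a routed packet. The route decides which function runs next, which is why the two transmission paths can share everything from this point on.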
The Big Picture
Transmission: the Kernel's Main Tasks
The ip_queue_xmit Case
    // Used by TCP and SCTP.
    // skb:      packet buffer descriptor
    // ipfragok: flag used by SCTP that indicates whether fragmentation is allowed
    int ip_queue_xmit(struct sk_buff *skb, int ipfragok)
    {
        struct sock *sk = skb->sk;
        struct inet_sock *inet = inet_sk(sk);   // socket the packet is sent through
        struct ip_options_rcu *inet_opt = NULL;
        struct rtable *rt;
        struct iphdr *iph;
        int res;

        /* Skip all of this if the packet is already routed,
         * f.e. by something like SCTP.
         */
        rcu_read_lock();
        rt = skb_rtable(skb);
        // If the buffer already carries valid routing information,
        // there is no need to look up the routing table.
        if (rt != NULL)
            goto packet_routed;

        /* Make sure we can route this packet. */
        rt = (struct rtable *)__sk_dst_check(sk, 0);
        inet_opt = rcu_dereference(inet->inet_opt);   // load the socket's IP options
        if (rt == NULL) {
            __be32 daddr;

            /* Use correct destination address if we have options. */
            daddr = inet->daddr;
            if (inet_opt && inet_opt->opt.srr)
                daddr = inet_opt->opt.faddr;

            {
                struct flowi fl = {
                    .oif = sk->sk_bound_dev_if,
                    .mark = sk->sk_mark,
                    .nl_u = { .ip4_u = { .daddr = daddr,
                                         .saddr = inet->saddr,
                                         .tos = RT_CONN_FLAGS(sk) } },
                    .proto = sk->sk_protocol,
                    .flags = inet_sk_flowi_flags(sk),
                    .uli_u = { .ports = { .sport = inet->sport,
                                          .dport = inet->dport } }
                };

                /* If this fails, retransmit mechanism of transport layer will
                 * keep trying until route appears or the connection times
                 * itself out.
                 */
                security_sk_classify_flow(sk, &fl);
                if (ip_route_output_flow(sock_net(sk), &rt, &fl, sk, 0))
                    goto no_route;
            }
            sk_setup_caps(sk, &rt->u.dst);
        }
        skb_dst_set(skb, dst_clone(&rt->u.dst));

    packet_routed:
        if (inet_opt && inet_opt->opt.is_strictroute &&
            rt->rt_dst != rt->rt_gateway)
            goto no_route;

        /* OK, we know where to send it, allocate and build IP header. */
        // Move skb->data back so that it points to the IP header
        // instead of the payload.
        skb_push(skb, sizeof(struct iphdr) +
                      (inet_opt ? inet_opt->opt.optlen : 0));
        skb_reset_network_header(skb);

        /* Build the IP header. */
        iph = ip_hdr(skb);
        *((__be16 *)iph) = htons((4 << 12) | (5 << 8) | (inet->tos & 0xff));
        if (ip_dont_fragment(sk, &rt->u.dst) && !ipfragok)
            iph->frag_off = htons(IP_DF);
        else
            iph->frag_off = 0;
        iph->ttl = ip_select_ttl(inet, &rt->u.dst);
        iph->protocol = sk->sk_protocol;
        iph->saddr = rt->rt_src;
        iph->daddr = rt->rt_dst;
        /* Transport layer set skb->h.foo itself. */

        if (inet_opt && inet_opt->opt.optlen) {
            iph->ihl += inet_opt->opt.optlen >> 2;
            ip_options_build(skb, &inet_opt->opt, inet->daddr, rt, 0);
        }

        ip_select_ident_more(iph, &rt->u.dst, sk,
                             (skb_shinfo(skb)->gso_segs ?: 1) - 1);

        skb->priority = sk->sk_priority;
        skb->mark = sk->sk_mark;

        res = ip_local_out(skb);
        rcu_read_unlock();
        return res;

    no_route:
        rcu_read_unlock();
        IP_INC_STATS(sock_net(sk), IPSTATS_MIB_OUTNOROUTES);
        kfree_skb(skb);
        return -EHOSTUNREACH;
    }
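The statement that writes the first 16 bits of the header packs three fields into one store: the version (4), the header length in 32-bit words (5, that is 20 bytes without options) and the TOS byte. The user-space sketch below reproduces that arithmetic so it can be checked in isolation. The helper names ipv4_first_word, ipv4_write_first_word and ipv4_ihl_with_options are hypothetical, and the sketch assumes the option length is a multiple of 4 bytes.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htons */

/* Hypothetical helper: version, IHL and TOS combined, in host byte order.
 * Bits 15..12 = version 4, bits 11..8 = IHL 5 words, bits 7..0 = TOS. */
static uint16_t ipv4_first_word(uint8_t tos)
{
    return (uint16_t)((4u << 12) | (5u << 8) | tos);
}

/* Hypothetical helper: store the word in network byte order at the start
 * of a header buffer, as the single 16-bit store in ip_queue_xmit does. */
static void ipv4_write_first_word(uint8_t *hdr, uint8_t tos)
{
    uint16_t w = htons(ipv4_first_word(tos));
    memcpy(hdr, &w, sizeof w);
}

/* Hypothetical helper: IHL after options are added. optlen is in bytes,
 * IHL counts 32-bit words, hence the shift by 2 (assumes optlen % 4 == 0). */
static unsigned ipv4_ihl_with_options(unsigned optlen)
{
    return 5u + (optlen >> 2);
}
```

For example, a TOS of 0x10 gives the host-order value 0x4510, which lands in the buffer as the bytes 0x45 0x10. With 8 bytes of options the header length becomes 7 words, or 28 bytes, which matches the amount skb_push reserves in the listing above.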
The ip_push_pending_frames Case
ip_append_data