I have an 887VAM router that is connected to Amazon VPC with redundant IPsec tunnels. The Amazon-generated config uses VTIs, so it's a route-based tunnel.
My office is 192.168.0.0/16
I am using Amazon VPC as a hub. It is 172.16.0.0/16
I also have a branch office with the same 887 router type, which is using a piece of 10.0.0.0/8. It has a tunnel to the VPC too, but instead of connecting to Amazon's VPN GW, it connects to a software appliance (Sophos UTM) VPN gateway in the VPC.
Spoke-to-hub comms are working fine on both tunnels, and in both directions.
What's not working is spoke to spoke traffic.
My route statements look like this:
ip route 172.16.0.0 255.255.0.0 Tunnel1 track 100
ip route 10.0.0.0 255.0.0.0 Tunnel1 172.16.0.239 track 100
ip route 172.16.0.0 255.255.0.0 Tunnel2 track 200
ip route 10.0.0.0 255.0.0.0 Tunnel2 172.16.0.239 track 200
ip route 0.0.0.0 0.0.0.0 Dialer1
The 172.16.0.239 address is the interface of the Sophos UTM in the VPC. (Amazon GW does not have knowledge of the existence of 10.0.0.0 subnets and where they are located since they are outside the VPC address space).
The routers have IOS version 15.1(4)M4
The route table looks like this:
Gateway of last resort is 0.0.0.0 to network 0.0.0.0
S* 0.0.0.0/0 is directly connected, Dialer1
S 10.0.0.0/8 [1/0] via 172.16.0.239, Tunnel2
             [1/0] via 172.16.0.239, Tunnel1
150.101.0.0/32 is subnetted, 2 subnets
C 150.101.xxx.yyy is directly connected, Dialer1
C 150.101.aaa.bbb is directly connected, Dialer1
169.254.0.0/16 is variably subnetted, 4 subnets, 2 masks
C 169.254.247.16/30 is directly connected, Tunnel2
L 169.254.247.18/32 is directly connected, Tunnel2
C 169.254.247.20/30 is directly connected, Tunnel1
L 169.254.247.22/32 is directly connected, Tunnel1
S 172.16.0.0/16 is directly connected, Tunnel2
                is directly connected, Tunnel1
192.168.1.0/24 is variably subnetted, 2 subnets, 2 masks
C 192.168.1.0/24 is directly connected, Vlan1
L 192.168.1.1/32 is directly connected, Vlan1
192.168.100.0/24 is variably subnetted, 2 subnets, 2 masks
C 192.168.100.0/24 is directly connected, Vlan2
L 192.168.100.1/32 is directly connected, Vlan2
That all looks OK to me superficially but I'm probably overlooking something here.
The problem is, I don't believe spoke to spoke traffic from 192.168.1.10 directed to 10.1.2.1 is getting into the tunnel. (It seems to be going nowhere).
I was able to verify this by setting up a high-frequency ping and observing the traffic counters on interface Tunnel2 (the active one at this time). Ping traffic to the hub is working fine, though.
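For reference, a rough sketch of that counter check run from the router itself rather than from a LAN host. Pinging from the router with Vlan1 as the source is an assumption here (the test described above was run from 192.168.1.10); the interface name and destination are taken from this post.

```
! Sketch only: privileged EXEC on the main-office 887
! Which route does the router select for the branch host?
show ip route 10.1.2.1
! Zero the counters so any new traffic is easy to spot
clear counters Tunnel2
! Assumption: source from Vlan1 (192.168.1.1) to mimic a LAN client
ping 10.1.2.1 source Vlan1 repeat 100
! If the pings enter the tunnel, "packets output" should rise by about 100
show interfaces Tunnel2 | include packets
```

Repeating the same sequence with a 172.16.x.x destination gives a baseline to compare against, since hub traffic is known to work.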
Pretty sure I can't use dynamic routing or BGP here - need to make this work using static routes please.
Not even going to worry about anything else 'downstream' being wrong until I see the traffic is actually getting into the tunnel.
I don't really know enough about the more exotic stuff such as policy-based routes, vrf etc to know if they would be relevant or helpful.
Any advice from the gurus?
Could you please tell me why you don't use a direct tunnel between your locations?
Quite a few reasons really:
1. Most of the traffic is between each branch network and AWS Cloud.
2. Main office tunnel is for managing what's in the cloud, but I also need to get into the branch networks too.
3. Minimising number of separate tunnels on each router.
4. Minimising long term configuration nightmares...
It's a classic hub and spoke scenario - it should be possible to make this work (without dynamic routing or BGP).
Also, as an aside, there is no inter-branch traffic. (So in that respect it is a slightly simpler case than needing traffic to flow from every spoke to every other spoke).
I guess your primary question should be: do your providers (Amazon VPC and the Sophos device) allow traffic forwarding between your sites?
If yes, then provide us some traces from one site to the other and vice versa, plus the routing table.
If not, then you should build your own VPN between the sites.
PS: could you please draw a diagram of how you plan for traffic from the branch to reach the Sophos, then the VPC gateway, then your hub?
PS2: your statement "The problem is, I don't believe spoke to spoke traffic from 192.168.1.10 directed to 10.1.2.1 is getting into the tunnel." is not clear; could you please add IP addresses to the diagram?
My belief at this time is "yes" ... traffic forwarding is supported by both VPN gateways in the Amazon VPC.
Diagram is as follows:
Main Office <----> Amazon VPNGW .....(VPC)..... Sophos UTM <-----> Branch Office
Cisco 887                            "Hub"                         Cisco 887
192.168.x.x                       172.16.x.x                       10.x.x.x
Only the private IPs are shown for each network. (The publics are irrelevant for the purposes of this discussion)
To clarify the point in your PS2: I can see from the packet stats on the main office router that the packets destined for the spoke (10.x.x.x) are not even making it into the tunnel, hence there is nothing to trace. It's clearly a routing issue. The route table is as shown in my original post.
Ok, let's troubleshoot the issue with your tunnel.
1) "show ip cef 10.0.0.1"
2) what type of tunnel are Tu1 and Tu2 - GRE+IPSec, DMVPN ... (sh runn int tu1/tu2 could help)?
3) do you have any NAT on the router?
4) please provide a trace from any client toward the 10.0.0.0/8 network;
5) please provide a trace from the router toward the 10.0.0.0/8 network;
6) the same as 5, but force the router to use the Vl2 address as the source.
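As a sketch, items 1, 2, 3, 5 and 6 map to roughly the following commands on the main-office router. The target 10.1.2.1 is borrowed from the original post as an example host inside 10.0.0.0/8, and using `show ip nat translations` for item 3 is an assumption about what that question is after.

```
! 1) CEF forwarding entry for a branch address
show ip cef 10.0.0.1
! 2) Tunnel interface configuration (tunnel mode, source, protection profile)
show running-config interface Tunnel1
show running-config interface Tunnel2
! 3) Assumption: one way to see whether NAT is touching this traffic
show ip nat translations
! 5) Trace from the router, default source address
traceroute 10.1.2.1
! 6) Same trace, sourced from the Vlan2 address
traceroute 10.1.2.1 source Vlan2
```

If the inline `source` keyword is not accepted on a given release, entering `traceroute` with no arguments starts the extended dialog, which prompts for a source address.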
PS: during troubleshooting you could try enabling "debug ip icmp" for extended diagnostics; I would suggest having "logg buff 7" + "logg buffer 128000" (or more) + "logg mon 6" + "logg con 3" if you enable such a debug on a production network. After the tests, run "sh logg | i ICMP" + "u all"
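Written out in full, the abbreviated commands in the PS would look roughly like the following. Combining the buffer size and severity level into a single `logging buffered` line is an assumption about the intent of the two separate "logg buff" commands; the severity names correspond to levels 7, 6 and 3.

```
configure terminal
 ! 128000-byte buffer, capture up to severity 7 (debugging)
 logging buffered 128000 debugging
 ! Terminal monitor sessions: severity 6 (informational) and above
 logging monitor informational
 ! Console: severity 3 (errors) and above, so debug output does not flood it
 logging console errors
end
debug ip icmp
! ... run the ping and trace tests here ...
show logging | include ICMP
undebug all
```

Keeping the console at a low verbosity is the important part on a production router, since debug output sent to the console can load the CPU.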