mirror of https://github.com/knorrie/network-examples synced 2025-08-31 06:26:12 +00:00

Merge branch 'wip'

This commit is contained in:
Hans van Kranenburg
2016-02-07 18:30:42 +01:00
116 changed files with 3790 additions and 1 deletions

@@ -1,4 +1,54 @@
Network Examples
================
This is a work in progress, nothing interesting to see yet. Please come back later.
Welcome to my Linux Networking tutorials. The first part, learning two widely used routing protocols, OSPF and BGP, is almost completed.
## Target audience
You've been a Linux server and network administrator for some years, and have been building office and/or colocation networks with IPv4, IPv6, firewalls with IPTables, some stateful filtering (and NAT for IPv4). You've set up VPN tunnels between different locations to be able to reach the internal IPv4 network using RFC1918 addresses on the other side.
You know how to use the `iproute2` programs (`ip a`, `ip r`, etc) to set up your networking, and haven't typed the `ifconfig` or `route` commands in your terminal since 1999. You know how to debug problems using `tcpdump`, `traceroute` etc...
But... your network doesn't show much redundancy. You're pointing a static route to your VPN tunnel to be able to reach the other side, and when you connect a third location, you're realizing that this way of working is getting more and more painful.
You're googling the interwebz for tutorials about an introduction to routing protocols, and discover that most of them start out with a huge amount of theoretical information, and then teach you how to configure Cisco equipment.
You found out that there's helper software to make routing more dynamic, like BIRD, but you have no idea where to start. You don't have money to buy physical routers, switches and cables to connect them, and you got a headache after once playing around with a Cisco simulator for an hour, missing your cozy Linux command line.
## Does the web need more tutorials on these topics?
There aren't a lot of tutorials that take a practical approach, ignoring most of the theory that's not yet needed to be known before just starting to explore the working of e.g. a routing protocol. In my opinion there is really no use in first of all learning that "a type 5 AS-external Link State Advertisement is not allowed in a Not So Stubby Area, unless it's a type 7 Link State Advertisement, that has been converted to a type 5 at an Area Border router, which makes it possible to traverse a..." whatever... :o)
And, while BIRD (the routing software used in these tutorials) has good reference documentation, it's missing tutorial-style documentation for new users. Even if you already know routing protocols, these tutorials should be a quick way to learn using BIRD and some of its quirks on Linux.
I hope that while following the tutorials, you're going to experience the "Wow!" moments, that "make the penny drop" and suddenly make you understand a lot more about complex networking topologies on the internet. Just reading the pages will not make that happen, setting up the examples and doing the assignments will.
Depending on your current knowledge and experience, following the tutorials can take quite some time. This is normal. Don't skip pages or parts of them, because when writing, I assume that everything explained before is already known. Even though the tutorials are meant as an introduction, there's still a huge amount of information hidden in them.
## Getting up to speed setting up test environments
All test setups used in the tutorials are built using Linux, LXC and OpenvSwitch for setting up network topologies, and using BIRD as routing daemon.
* [Setting up a lab environment](/lxcbird/README.md) explains how to set up a local test environment to build example networks using LXC and openvswitch. The result of the tutorial is having a base container that can be cloned into routers and end hosts.
* [A basic network example](/birdhouse-intro/README.md) provides an introduction to the computer network of a fictional company, the Birdhouse Factory. The tutorial part is to practice some more with setting up a network with containers and openvswitch.
## Learning OSPF and BGP
* [The Birdhouse Factory continued](/birdhouse-vlans-vpn/README.md) shows how the network at the Birdhouse Factory is evolving, and shows the need for a dynamic routing protocol when multiple routers are introduced.
* [An introduction to OSPF](/ospf-intro/README.md) explains the basics of using OSPF as an IGP.
* [An introduction to BGP](/bgp-intro/README.md) shows using BGP to make a connection to an external network that is managed by someone else. Also shows how the routes learned are propagated into the local network, having OSPF and iBGP work together.
* [A bigger BGP network](/bgp-contd/README.md) shows redundant routes, asymmetric traffic flow and explains the difference between "peering" and "transit".
* To be finished: [Wait what... The Internet](/routing-on-the-internet/README.md) shows that with the little amount of knowledge we built up about routing, we can suddenly understand how the whole Internet works! (So, fun with traceroute, bgp.he.net, etc...)
## Further ideas
* Building a pair of stateful filtering IPv4 and IPv6 gateways that can fail over traffic to each other, using `keepalived` and `conntrackd`, while participating in OSPF.
* Building a load balancer using LVS.
* ...
## Feedback, getting help
* I'm very interested in feedback on the amount of time it takes to work through the tutorials. Since I wrote them, I cannot predict that myself, and it may provide useful information about reordering the pages if they're too long.
* If you like IRC, try the `#bird` channel on Freenode. There's a bunch of friendly BIRD users in there that might help you with the BIRD configuration if you have questions.
Hans van Kranenburg, `hans@knorrie.org`

bgp-contd/README.md Normal file

@@ -0,0 +1,345 @@
BGP, Part II
============
In [BGP, Part I](/bgp-intro/README.md), our knowledge of OSPF to manage routing details inside a network was extended with the ability to connect entire networks together, hiding the detailed topology of one network to the other. Since it was done with a single BGP connection between only two networks, it's time for something more extensive now. Let's throw more networks and some redundancy into the game:
![BGP network with redundant paths](/bgp-contd/bgp-redundancy.png)
Hint: print this picture so you can make notes on it and keep it in sight during the tutorial!
In the picture, we see three networks, which are connected by several links that use the BGP protocol to exchange routing information. I've deliberately kept the internal structure of the networks as simple as possible.
* `AS65000` and `AS65010`, the left and right network, have a redundant link between them. When properly configured, this should make it possible to do maintenance on either of the two connections (e.g. shutting down `R0` and `R11`, or just one of them, or the physical link in between) without any interruption of network traffic between the two networks.
* The links between `R0` and `R11`, and between `R1` and `R10` are high bandwidth, low latency links.
* `AS65000` and `AS65010`, the two bigger networks, run OSPF inside, so that e.g. `R2` learns about the subnets that are used for the interconnects to the neighbour networks (to be able to resolve the BGP next hop on `R2`).
* `AS65020` is a smaller network, that has only a single gateway connecting it to the outside world. It's connected to both of the bigger networks with links that have a relatively lower bandwidth and higher latency. During the tutorial, we'll make adjustments to the configuration so `AS65020` can still reach both `AS65000` and `AS65010` when one of the external links is down.
* Ah, and this time it's an IPv6 only example.
## Setting up the example network
Well, you know the drill. :-)
Thankfully, most of the configuration is provided already, so we can quickly set up this whole network using our LXC environment. Just like in the previous tutorials, the birdbase container can be cloned, after which the lxc network information and configuration inside the containers can be copied into them.
1. Clone this git repository somewhere to be able to use some files from the bgp-contd/lxc/ directory inside.
2. lxc-clone the birdbase container several times:
for router in 0 1 2 10 11 12 20; do lxc-clone -s birdbase R$router; done
3. Set up the network interfaces in the lxc configuration. This can be done by removing all network related configuration that remains from the cloned birdbase container, and then appending all needed interface configuration by running the fixnetwork.sh script that can be found in `bgp-contd/lxc/` in this git repository. Of course, have a look at the contents of the script first, before executing it.
. ./fixnetwork.sh
4. Copy extra configuration into the containers. The bgp-contd/lxc/ directory inside this git repository contains a little file hierarchy that can just be copied over the configuration of the containers. For each router, it's a network/interfaces configuration file which adds an IP address that corresponds with the Router ID to the loopback interface, and a simple BIRD configuration file that serves as a starting point for our next steps.
5. Start all containers
for router in 0 1 2 10 11 12 20; do echo "Starting R${router}..."; lxc-start -d -n R$router; sleep 2; done
## Looking around...
There's a lot of bird config already in place, which looks like the Part I config, but multiple times for each connection. Take some time to browse through the bird6.conf files on all routers, so you can grasp the idea of how the configuration is set up. Hint: you can also do this from outside the lxc containers, so you can easily open multiple files and compare them.
Note: some parts of the configuration are still missing, because we'll be adding them while doing the tutorial. If you can already spot something that is missing now, you receive some bonus points! :)
### ...from the viewpoint of R2
`lxc-attach` to `R2` and verify the routes to the other two networks from there. It might take a few extra seconds after starting all the containers for them to figure out the complete network topology and set up all routing protocol sessions and exchange all routing information.
* Do a `birdc6 show route` and `birdc6 show route all 2001:db8:10::/48`.
* `traceroute 2001:db8:10::12` to `R12` in `AS65010`
* Look at the route from `R2` to `R20`: `birdc6 show route all 2001:db8:20::/48`
* `traceroute 2001:db8:20::20` to `R20` in `AS65020`
The output of the commands should show that `R2` knows two different routes to `2001:db8:10::/48`. One of them gets chosen to end up in the linux kernel routing table (marked with the `*`), and the information about the route shows from which ibgp connection the route was learned. In my case here it's the route learned from the ibgp session `ibgp_r0` that gets chosen, but it might as well be the one via `R1` in your case.
For traffic to `R20`, there's only one route shown from the viewpoint of `R2`, which is pointing to `R1`, since this is the only router in the local network that has a connection to the remote `AS65020`.
## Asymmetric traffic flows
Since there are two active network paths between `AS65000` and `AS65010`, each of the networks is free to choose which connection to use to send traffic to the other network.
Let's do some traceroute again, to look at traffic flow between `R2` and `R12`:
root@R2:/# traceroute r12
traceroute to r12 (2001:db8:10::12), 30 hops max, 80 byte packets
1 lan.r0 (2001:db8:0:1::ff) 0.384 ms 0.385 ms 0.389 ms
2 ebgp_r0.r11 (2001:db8:0:3::11) 0.565 ms 0.577 ms 0.490 ms
3 lo.r12 (2001:db8:10::12) 1.081 ms 1.012 ms 1.014 ms
root@R12:/# traceroute r2
traceroute to r2 (2001:db8::2), 30 hops max, 80 byte packets
1 lan.r10 (2001:db8:10:2::10) 0.292 ms 0.290 ms 0.369 ms
2 ebgp_r10.r1 (2001:db8:10:4::1) 0.435 ms 0.375 ms 0.392 ms
3 lo.r2 (2001:db8::2) 0.829 ms 0.785 ms 0.770 ms
Thanks to the information in the `/etc/hosts` file in the containers, the output shows us the names of the network interfaces that correspond to the used addresses.
As you can see, `R2` chooses to send traffic over `R0` as next hop, which will forward it to `R11` to get it into `AS65010`. In the meantime, traffic in the other direction chooses the path over `R10` and `R1`. When receiving traffic, a router has no idea of the path that a packet has traveled along before arriving. When sending traffic back, the router will just use its own thoughts about the best path towards that destination, which might mean choosing an outgoing network interface that is different from the one the packet it's responding to was received on.
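To make that a bit more tangible, here's a toy model in Python. It's not part of the lab setup, just a sketch: the next-hop tables are copied by hand from the two traceroutes above, and the names `NEXT_HOP` and `trace_path` are made up for this illustration. Every router only consults its own table, so nothing forces the way back to mirror the way out.

```python
# Toy model: every router forwards based only on its own next-hop table.
# The tables are copied by hand from the traceroute output above; the
# names NEXT_HOP and trace_path are invented for this sketch.
NEXT_HOP = {
    # router: {destination: next hop}
    "R2":  {"R12": "R0"},
    "R0":  {"R12": "R11"},
    "R11": {"R12": "R12"},
    "R12": {"R2": "R10"},
    "R10": {"R2": "R1"},
    "R1":  {"R2": "R2"},
}


def trace_path(src, dst, max_hops=30):
    """Follow next hops from src until dst is reached."""
    path = [src]
    while path[-1] != dst:
        if len(path) > max_hops:
            raise RuntimeError("too many hops, routing loop?")
        path.append(NEXT_HOP[path[-1]][dst])
    return path


forward = trace_path("R2", "R12")
backward = trace_path("R12", "R2")
print(" -> ".join(forward))   # R2 -> R0 -> R11 -> R12
print(" -> ".join(backward))  # R12 -> R10 -> R1 -> R2
# The return path is not simply the forward path reversed:
print(backward == forward[::-1])  # False
```
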
_Understanding asymmetric traffic flow is essential in the process of debugging network troubles in a larger network._
Let me give you an example why. Say, you're debugging latency on a connection to a remote host (look at the rtt measurements):
root@R2:/# traceroute r12
traceroute to r12 (2001:db8:10::12), 30 hops max, 80 byte packets
1 lan.r0 (2001:db8:0:1::ff) 0.389 ms 0.389 ms 0.398 ms
2 ebgp_r0.r11 (2001:db8:0:3::11) 0.614 ms 0.572 ms 0.525 ms
3 lo.r12 (2001:db8:10::12) 101.208 ms 101.121 ms 101.142 ms
When you're not aware of the possibility of asymmetric traffic flows, you could incorrectly assume that there's a problem with the network link between `R11` and `R12`, because of the introduced extra latency. However, there might be multiple other possibilities, since you do not know which route the traffic from `R12` back to you takes. In our little example network we know, since we just found out by looking at it from both ends. It could as well be the link between `R12` and `R10`, or between `R10` and `R1`, or even between `R1` and `R2`...
Let's have a look at a trace from the other end back to `R2`:
root@R12:/# traceroute r2
traceroute to r2 (2001:db8::2), 30 hops max, 80 byte packets
1 lan.r10 (2001:db8:10:2::10) 0.402 ms 0.341 ms 0.330 ms
2 ebgp_r10.r1 (2001:db8:10:4::1) 200.490 ms 200.472 ms 200.453 ms
3 lo.r2 (2001:db8::2) 101.053 ms 101.039 ms 101.020 ms
Router `R10` can be reached just fine, but the next step (`ebgp_r10.r1` is the network interface on `R1` that is looking at `R10`) shows quite some introduced latency. Since the shortest route from `R1` back to `R12` is by sending it back to `R10` directly, the total round trip in the second step shows twice the amount of extra latency, while the total round trip time for reaching `R2` only shows the introduced latency once.
Here's an edited version of the traceroute output, with the paths mentioned:
root@R2:/# traceroute r12
traceroute to r12 (2001:db8:10::12), 30 hops max, 80 byte packets
1 lan.r0 r2 -> r0 (ttl expired), r0 -> r2
2 ebgp_r0.r11 r2 -> r0 -> r11 (ttl expired), r11 -> r0 -> r2
3 lo.r12 r2 -> r0 -> r11 -> r12 (destination), r12 -> r10 -> r1 -> r2
root@R12:/# traceroute r2
traceroute to r2 (2001:db8::2), 30 hops max, 80 byte packets
1 lan.r10 r12 -> r10 (ttl expired), r10 -> r12
2 ebgp_r10.r1 r12 -> r10 -> r1 (ttl expired), r1 -> r10 -> r12
3 lo.r2 r12 -> r10 -> r1 -> r2 (destination), r2 -> r0 -> r11 -> r12
Introducing latency for test purposes can be done with the Linux traffic control tooling. Here are the two commands I just used on `R1` and `R10` to achieve the effect shown above:
root@R1:/# tc qdisc add dev ebgp_r10 root netem delay 100ms
root@R10:/# tc qdisc add dev ebgp_r1 root netem delay 100ms
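If you want to check the arithmetic behind those round trip times, here's a small Python sketch. It assumes that the two `netem` rules each add 100 ms in one direction on the link between `R1` and `R10`, and that all other links add nothing worth mentioning. The names `DELAY_MS`, `one_way` and `rtt` are invented for this example.

```python
# One-way delay in ms added per *directed* link. netem on R1's ebgp_r10
# delays packets leaving R1 towards R10; netem on R10's ebgp_r1 delays
# packets leaving R10 towards R1. All other links are treated as 0 ms.
DELAY_MS = {("R1", "R10"): 100, ("R10", "R1"): 100}


def one_way(path):
    """Sum the added delay over consecutive hops of a path."""
    return sum(DELAY_MS.get(link, 0) for link in zip(path, path[1:]))


def rtt(out_path, back_path):
    """Round trip time: way out plus way back, which may differ."""
    return one_way(out_path) + one_way(back_path)


# (way out, way back) per traceroute step, as seen from R12.
hops_from_r12 = {
    "lan.r10":     (["R12", "R10"], ["R10", "R12"]),
    "ebgp_r10.r1": (["R12", "R10", "R1"], ["R1", "R10", "R12"]),
    "lo.r2":       (["R12", "R10", "R1", "R2"], ["R2", "R0", "R11", "R12"]),
}
for name, (out_path, back_path) in hops_from_r12.items():
    print(name, rtt(out_path, back_path), "ms")
# lan.r10 0 ms
# ebgp_r10.r1 200 ms
# lo.r2 100 ms

# Seen from R2, only the reply from R12 crosses the slow link, once:
print(rtt(["R2", "R0", "R11", "R12"], ["R12", "R10", "R1", "R2"]), "ms")  # 100 ms
```

The second step from `R12` crosses the delayed link twice, the third step only once, which matches the 200 ms and 101 ms seen in the real output.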
## Playing with redundant paths
Having redundancy in the network has the advantage that the network can survive a partial outage, which can be either planned (maintenance) or unplanned (failure of a component):
![BGP network with an inactive path](/bgp-contd/bgp-redundancy-ibgp-r0-r1.png)
### Moving around traffic
By disabling the BGP session on `R0` towards `R11`, we can force traffic between `AS65000` and `AS65010` to choose the route over `R1` and `R10` instead:
root@R0:/# birdc6
BIRD 1.4.5 ready.
bird> show protocols
name proto table state since info
kernel1 Kernel master up 15:07:17
device1 Device master up 15:07:17
ospf1 OSPF master up 15:07:17 Running
static1 Static master up 15:07:17
p_master_to_bgp Pipe master up 15:07:17 => t_bgp
originate_to_r11 Static t_r11 up 15:07:17
ebgp_r11 BGP t_r11 up 15:07:34 Established
p_bgp_to_r11 Pipe t_bgp up 15:07:17 => t_r11
ibgp_r2 BGP t_bgp up 15:08:17 Established
bird> disable ebgp_r11
ebgp_r11: disabled
When doing some `mtr` from `R2` to `R12`, you can see the path switch over to the other connection, while disabling the eBGP session on `R0`:
root@R2:/# mtr r12
My traceroute [v0.85]
R2 (::) Sun Nov 29 15:30:58 2015
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. 2001:db8:0:1::ff 0.0% 36 0.1 0.1 0.1 0.5 0.0
2001:db8:0:1::1
2. 2001:db8:0:3::11 0.0% 36 0.1 0.1 0.1 0.4 0.0
2001:db8:10:4::10
3. 2001:db8:10::12 0.0% 35 0.1 0.1 0.1 0.5 0.0
Also, the bird6 log file in `/var/log/bird/bird6.log` shows that bird gets an update from the iBGP session to `R0` about the fact that the route to `2001:db8:10::/48` over `R0` should no longer be used, followed by a replacement of the route in the linux kernel, pointing to `R1` instead:
ibgp_r0 > removed [replaced] 2001:db8:10::/48 via fe80::4ef:8ff:fe02:cef6 on lan
kernel1 < replaced 2001:db8:10::/48 via fe80::bc5d:a8ff:fee4:c062 on lan
Note that shutting down a BGP session will only stop the exchange of routing information. The link itself is not disabled in this example. This means that if for whatever reason (like manual configuration of routes) traffic would arrive over this link, it would still happily be handled by the linux kernel.
### Fixing iBGP
As I said a little earlier, there is still some configuration missing, although we didn't spot it yet it seems. Well, if you try to reach any router in `AS65010` from `R0`, you will see it fail:
root@R0:/# traceroute r11
traceroute to r11 (2001:db8:10::11), 30 hops max, 80 byte packets
connect: Network is unreachable
Also, `R20` is not reachable from `R0`, while the connection between `AS65000` and `AS65020` is still active...
root@R0:/# traceroute r20
traceroute to r20 (2001:db8:20::20), 30 hops max, 80 byte packets
connect: Network is unreachable
In the BGP introduction tutorial, we learned that iBGP sessions are used to share information about the reachability of remote networks outside our own AS. By setting up an iBGP connection between `R2` and both `R0` and `R1`, we can make sure that `R2` is kept up to date about paths to external networks that are connected via `R0` and `R1`. However, the same has to be done between `R0` and `R1`, so that each of them knows an alternative route to the remote network `AS65010` when its own connection to it is down, and so that `R0` also knows it can reach `AS65020` via `R1`.
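Here's a toy model in Python of who learns what, just to illustrate the reasoning. It's a sketch under a few assumptions: `ebgp_r11` on `R0` is still disabled from the previous section, a router knows the routes from its own eBGP sessions plus the eBGP routes of its direct iBGP neighbours, and routes learned over iBGP are not passed on to other iBGP neighbours (which is standard BGP behaviour when no route reflectors are used). The names `EBGP_ROUTES` and `known_routes` are made up.

```python
# External prefixes each router learns over its own eBGP sessions.
EBGP_ROUTES = {
    "R0": set(),  # assumption: ebgp_r11 is still disabled on R0
    "R1": {"2001:db8:10::/48", "2001:db8:20::/48"},
    "R2": set(),  # R2 has no external connections
}


def known_routes(router, ibgp_sessions):
    """Own eBGP routes plus the eBGP routes of direct iBGP neighbours."""
    routes = set(EBGP_ROUTES[router])
    for a, b in ibgp_sessions:
        if router in (a, b):
            peer = b if router == a else a
            # Only the peer's eBGP routes: iBGP-learned routes are not
            # re-advertised to other iBGP neighbours.
            routes |= EBGP_ROUTES[peer]
    return routes


provided = [("R0", "R2"), ("R1", "R2")]   # sessions in the provided config
fixed = provided + [("R0", "R1")]         # with the missing session added

print(sorted(known_routes("R2", provided)))
# ['2001:db8:10::/48', '2001:db8:20::/48']
print(sorted(known_routes("R0", provided)))
# [] which is why R0 says "Network is unreachable"
print(sorted(known_routes("R0", fixed)))
# ['2001:db8:10::/48', '2001:db8:20::/48']
```
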
### Assignments
Now, do the following things:
* Add iBGP configuration to share BGP routes between `R0` and `R1`, and also between `R10` and `R11`. Pay special attention to the `import` and `export` rules. Use e.g. `show route protocol ibgp_r0` on `R1` to verify that it's receiving information about external routes.
bird> show route protocol ibgp_r0
2001:db8:10::/48 via fe80::4ef:8ff:fe02:cef6 on lan [ibgp_r0 23:24:03 from 2001:db8::ff] (100/20) [AS65010i]
bird> show route export ibgp_r0
2001:db8:20::/48 via 2001:db8:0:5::20 on ebgp_r20 [ebgp_r20 2015-11-28] * (100) [AS65020i]
2001:db8:10::/48 via 2001:db8:10:4::10 on ebgp_r10 [ebgp_r10 2015-11-28] * (100) [AS65010i]
* Check that you can reach every external network from every router in all of the three networks.
* Try disabling some of the links between routers by using the `disable`/`enable` commands on the bird command line, and check if you still can reach all parts of the network.
* Change `import` and `export` filters in the `protocol bgp ebgp_r*` sections in `bird6.conf` so that you end up with a situation where all traffic is forced into an asymmetric traffic pattern in which traffic from `AS65000` to `AS65010` has to leave via `R1` to `R10`, and traffic back flows over `R11` to `R0`. Verify the changes seen in bird `show route all` output when you change filters.
## A closer look at the BIRD configuration
Here's a picture of the tables and protocols used in the BIRD configuration of `R1`:
![BIRD protocols, tables, import and export](/bgp-contd/bird-prototable.png)
As you might have noticed, I prefer using multiple internal routing tables in BIRD, because that keeps the filters less complex. Since we're dealing with a very limited number of routes, it's not a problem at all that a lot of routes are duplicated in multiple tables.
If you compare this drawing to the previous one from the BGP Introduction tutorial, you'll notice an extra table, in between the eBGP sessions and the master table. Here's the corresponding part from `bird6.conf`:
##############################################################################
# BGP table
#
# Use this routing table to gather external routes received via BGP which we
# want to push to the kernel via our master table and to other routers in our AS
# via iBGP or even to other routers outside our AS again (transit), which can
# be connected here or to a router elsewhere on the border of our AS.
table t_bgp;
protocol pipe p_master_to_bgp {
table master;
peer table t_bgp;
import all; # default
export none; # default
}
eBGP sessions are connected to `t_bgp` via an intermediate table of their own, which is used to insert routes that originate from our own AS and that we want to announce to the other side. iBGP sessions are connected to `t_bgp` directly, as they just need to share the collection of externally learned routes with routers inside the network.
### Assignments
* Compare the drawing with the configuration file of `R1`.
## Redundancy for the branch office: Transit traffic
The following picture is the same as the one at the beginning of this page:
![BGP network with redundant paths](/bgp-contd/bgp-redundancy.png)
Until now, we have ignored the top network, with `R20` in it. Let's have a better look at this part now. `AS65020` is connected to both of the other networks with a single connection.
What would the result be for the connectivity of `AS65020` if one of those links would be down because of maintenance or a defect? We can do some tests to find out.
First make sure you can reach `R20` from `R10`:
root@R10:/# traceroute r20
traceroute to r20 (2001:db8:20::20), 30 hops max, 80 byte packets
1 lan.r11 (2001:db8:10:2::11) 0.901 ms 0.883 ms 0.868 ms
2 lo.r20 (2001:db8:20::20) 0.897 ms 0.804 ms 0.803 ms
Next, shut down eBGP between `R11` and `R20`, and see what happens...
root@R11:/# birdc6
BIRD 1.4.5 ready.
bird> disable ebgp_r20
ebgp_r20: disabled
root@R10:/# traceroute r20
traceroute to r20 (2001:db8:20::20), 30 hops max, 80 byte packets
connect: Network is unreachable
There's still an open network path to `R20`, via `R1`. But, `R10` is not aware of this, because the routers in `AS65000` do not tell the ones in `AS65010` that they also know a path to `R20`...
root@R1:/# birdc6
BIRD 1.4.5 ready.
bird> show route table t_bgp
2001:db8:20::/48 via 2001:db8:0:5::20 on ebgp_r20 [ebgp_r20 20:47:42] * (100) [AS65020i]
2001:db8:10::/48 via 2001:db8:10:4::10 on ebgp_r10 [ebgp_r10 2015-11-28] * (100) [AS65010i]
via fe80::4ef:8ff:fe02:cef6 on lan [ibgp_r0 2015-12-01 from 2001:db8::ff] (100/20) [AS65010i]
bird> show route export ebgp_r10
2001:db8::/48 blackhole [originate_to_r10 2015-11-28] * (200)
As seen above in the configuration diagram, the routers that connect to external networks in `AS65000` and `AS65010` collect all external routes in their BIRD `t_bgp` table, so they can be sent over iBGP to the other routers in their network. However, as you can see in the `bird6.conf` configuration files of them, the routes in `t_bgp` are not exported again to external peers.
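The effect of not re-exporting can be shown with a toy AS-level model in Python. This is only a sketch with simplifying assumptions: one prefix per AS, every AS always announces its own prefix to its neighbours, only the ASes listed in `transit` re-announce what they learned from one neighbour to another, a route is refused when the receiver's own AS number is already in the AS path, and the shortest AS path wins. The function name `propagate` and its parameters are invented for this example.

```python
def propagate(links, transit):
    """Return routes[asn][origin] = AS path towards origin, seen from asn."""
    ases = sorted({asn for link in links for asn in link})
    routes = {asn: {asn: []} for asn in ases}  # everyone knows itself
    changed = True
    while changed:  # repeat until no announcement changes anything
        changed = False
        for a, b in links:
            for sender, receiver in ((a, b), (b, a)):
                for origin, path in list(routes[sender].items()):
                    learned = origin != sender
                    if learned and sender not in transit:
                        continue  # not re-exported to external peers
                    new_path = [sender] + path
                    if receiver in new_path:
                        continue  # would loop back to the receiver
                    old = routes[receiver].get(origin)
                    if old is None or len(new_path) < len(old):
                        routes[receiver][origin] = new_path
                        changed = True
    return routes


# eBGP between R11 and R20 is disabled, so AS65010 and AS65020 only
# have AS65000 in common.
links = [(65000, 65010), (65000, 65020)]

print(propagate(links, transit=set())[65010].get(65020))    # None
print(propagate(links, transit={65000})[65010].get(65020))  # [65000, 65020]
```

Without transit, `AS65010` never hears about the prefix of `AS65020`, even though packets could physically get there via `AS65000`.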
### Assignments
![BGP Transit](/bgp-contd/bgp-redundancy-transit1.png)
* Change the BIRD configuration of `R1` so that externally learned routes are exported again to other external networks. The routes are available in the `t_bgp` table, and can simply all be exported again towards `R10` and `R20`. Now check that you can already reach `R20` from `R10` with a traceroute, which means that `R20` also knows a path back to `R10` now:
root@R10:/# traceroute r20
traceroute to r20 (2001:db8:20::20), 30 hops max, 80 byte packets
1 ebgp_r10.r1 (2001:db8:10:4::1) 0.330 ms 0.317 ms 0.309 ms
2 lo.r20 (2001:db8:20::20) 0.441 ms 0.432 ms 0.405 ms
* In the logfiles in `/var/log/bird/bird6.log` on `R20`, `R1` and `R10` you should see that the extra routing information was received, and how bird processed the routes internally. Pay some attention to the 'ignored', 'filtered' and 'rejected by protocol' lines. They show that the defined filters are used, and they also show that bird will by default be clever about not pushing routes back through a pipe or protocol it just learned them from, which simplifies the filter definitions a lot, since we can just use 'all' for exporting from `t_bgp` to e.g. `t_r20`.
* On `R1`, check `show route export ebgp_r10` and `show route export ebgp_r20`.
* Adjust the other three routers in the same way: `R0`, `R10` and `R11`. Don't change `R20`, since we do not want `AS65020`, with its two slow low-bandwidth links, to be a transit area for traffic between `AS65000` and `AS65010`. Those two networks already have fast redundant links between them.
* Enable the BGP session between `R11` and `R20` again, and notice that the routing immediately switches back to using this path.
* Disable the network path between `R1` and `R20`, and make sure you can still reach `R20` from `R2`.
### Extra assignment
Instead of disabling a whole BGP session between routers to stop using a particular path, it's also possible to keep the BGP connection alive and just stop announcing prefixes (our own ones, or re-announced ones if we're a transit network) while still accepting prefixes from the remote side, or just the other way around. When doing so, we can configure a situation with an asymmetric traffic flow.
![BGP Transit fun assignment](/bgp-contd/bgp-redundancy-transit2.png)
* Without disabling any BGP session, change the filters configuration so that traffic flow between `R10` and `R20` is as shown in the picture above:
root@R10:/# traceroute r20
traceroute to r20 (2001:db8:20::20), 30 hops max, 80 byte packets
1 lan.r11 (2001:db8:10:2::11) 0.068 ms 0.018 ms 0.016 ms
2 ebgp_r11.r0 (2001:db8:0:3::ff) 0.380 ms 0.356 ms 0.341 ms
3 ebgp_r10.r1 (2001:db8:10:4::1) 0.462 ms 0.458 ms 0.388 ms
4 lo.r20 (2001:db8:20::20) 0.401 ms 0.409 ms 0.435 ms
root@R20:/# traceroute r10
traceroute to r10 (2001:db8:10::10), 30 hops max, 80 byte packets
1 ebgp_r20.r11 (2001:db8:10:6::11) 0.073 ms 0.019 ms 0.018 ms
2 lo.r10 (2001:db8:10::10) 0.320 ms 0.268 ms 0.245 ms
* Notice that this is actually a stupid way to prefer specific routes for traffic: by disabling BGP sessions, or by not announcing or not accepting routes, we reduce redundancy in the network, because the disabled paths no longer function as less-preferred fallback paths either. See BGP route selection below for more thoughts about this.
## Bonus material
Now that you've learned the basics of building a network with BGP, you should be able to better understand the average "BGP introduction" page published on the Internet that immediately tries to overwhelm you with technical terms instead of just providing an example. ;-]
The following topics are also a minimal set of hints for further study:
### Peering and Transit
In the above examples, we've already seen the difference between just connecting two networks so they can reach each other, and on top of that, forwarding route announcements, so that a network can act as transit area for traffic between two other networks. These concepts are called "Peering" and "Transit" and if you search for them, you should be able to find a lot more information.
The next page of this tutorial, [Routing on the internet](/the-internet/README.md), will be about discovering the fact that with the limited knowledge we have now, it's already possible to understand how the whole internet works together. The page hasn't been written yet, but will show how networks of ISPs, Transit Providers and Internet Exchanges connect the whole world together and how you can find out using tools on the internet how they're connected, and how they provide a path between your own computer to every remote destination on the internet you're connecting to every day.
### BGP route selection
As hinted above ("Notice that this is actually a stupid way...") there are smarter ways to prefer specific network paths for specific routes than to just disable a path or stop accepting announcements. BGP has a bunch of knobs to adjust that you can combine to create a routing policy inside your own AS.
In the BIRD documentation about BGP, you can find a list about "Route selection rules" that BIRD applies to select which BGP route to a particular destination will be chosen if multiple ones are available for the same prefix:
* Prefer route with the highest Local Preference attribute.
* Prefer route with the shortest AS path.
* Prefer IGP origin over EGP and EGP origin over incomplete.
* Prefer the lowest value of the Multiple Exit Discriminator.
* Prefer routes received via eBGP over ones received via iBGP.
* Prefer routes with lower internal distance to a boundary router.
* Prefer the route with the lowest value of router ID of the advertising router.
Using other resources on the internet you should be able to find out what all of these mean. Using the BIRD documentation, you can change the configuration of all routers in our example network to route traffic around in different ways using these options.
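To get a feeling for how such an ordered list of rules works, here's a simplified Python sketch that turns the list above into a sort key, one tuple element per rule. It's an illustration, not BIRD's implementation: real implementations add conditions that are ignored here (for example, the Multiple Exit Discriminator is normally only compared between routes from the same neighbouring AS). The `Route` class, its field names and the router ID values are made up.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass
class Route:
    via: str
    local_pref: int = 100
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    ebgp: bool = False       # learned over eBGP (True) or iBGP (False)
    igp_distance: int = 0    # internal distance to the boundary router
    router_id: int = 0       # router ID of the advertising router


def preference_key(r):
    """Smaller key = more preferred; earlier elements win over later ones."""
    return (
        -r.local_pref,            # highest Local Preference
        len(r.as_path),           # shortest AS path
        ORIGIN_RANK[r.origin],    # IGP over EGP over incomplete
        r.med,                    # lowest MED
        0 if r.ebgp else 1,       # eBGP over iBGP
        r.igp_distance,           # closest boundary router
        r.router_id,              # lowest router ID
    )


def best(routes):
    return min(routes, key=preference_key)


# Two otherwise identical routes, like the ones R2 sees via R0 and R1.
a = Route(via="R0", as_path=(65010,), igp_distance=20, router_id=1)
b = Route(via="R1", as_path=(65010,), igp_distance=20, router_id=2)
print(best([a, b]).via)  # R0, decided only by the very last rule

# A higher Local Preference overrides everything below it.
b.local_pref = 200
print(best([a, b]).via)  # R1
```
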
Within a single AS, it's really important to have a single policy, so that all routers are on the same page about where to send traffic. You cannot have two border routers, which independently from each other determine that the other one should be used as exit point to a specific external peer. They would pingpong all traffic between them until the IP packet TTL expires and then drop the traffic, resulting in a big black hole and a bunch of overloaded internal connections. So, yes, this can get quite complex quickly if you start to make customizations.
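The ping-pong effect is easy to simulate. In this Python sketch, two border routers each believe the other one is the best exit, and a packet bounces between them until its hop limit runs out. The router names, the starting hop limit of 64 and the function name `forward` are assumptions made for this example.

```python
# Conflicting policies: each router points to the other as the exit.
NEXT_HOP = {"R0": "R1", "R1": "R0"}


def forward(start, hop_limit=64):
    """Return (times the internal link was crossed, router that drops)."""
    router, crossings = start, 0
    while True:
        hop_limit -= 1            # every router decrements the hop limit
        if hop_limit <= 0:
            return crossings, router  # packet is dropped here
        router = NEXT_HOP[router]
        crossings += 1


crossings, dropped_at = forward("R0")
print(crossings, dropped_at)  # 63 R1
```

So a single packet loads the link between the two routers 63 times before it disappears, without ever getting closer to its destination.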
Remember that we started this tutorial with an example network in which traffic between `AS65000` and `AS65010` was already using the two paths between them in an asymmetric way. Because the setup of both networks is so similar and mirrored, the fact that traffic back and forth flows asymmetrically is actually thanks to the last rule: "Prefer the route with the lowest value of router ID of the advertising router.". After initially setting up the example, I had to swap `R10` and `R11` again to get this behaviour. :-)

@@ -0,0 +1,88 @@
router id 10.0.0.0;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8::ff/128;
interface "lan" {
};
interface "ebgp_r11" {
stub;
};
};
}
protocol static {
import all;
route 2001:db8::/48 blackhole;
}
##############################################################################
# BGP table
#
# Use this routing table to gather external routes received via BGP which we
# want to push to the kernel via our master table and to other routers in our AS
# via iBGP or even to other routers outside our AS again (transit), which can
# be connected here or to a router elsewhere on the border of our AS.
table t_bgp;
protocol pipe p_master_to_bgp {
table master;
peer table t_bgp;
import all; # default
export none; # default
}
##############################################################################
# eBGP R11
#
table t_r11;
protocol static originate_to_r11 {
table t_r11;
import all; # originate here
route 2001:db8::/48 blackhole;
}
protocol bgp ebgp_r11 {
table t_r11;
local 2001:db8:0:3::ff as 65000;
neighbor 2001:db8:0:3::11 as 65010;
import all;
export all;
}
protocol pipe p_bgp_to_r11 {
table t_bgp;
peer table t_r11;
import where proto = "ebgp_r11";
export none;
}
##############################################################################
# iBGP
#
protocol bgp ibgp_r2 {
table t_bgp;
igp table master;
import none;
export all;
local 2001:db8::ff as 65000;
neighbor 2001:db8::2 as 65000;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,18 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8::ff/128 dev lo
down ip addr del 2001:db8::ff/128 dev lo
auto lan
iface lan inet manual
up ip link set up dev lan
up ip addr add 2001:db8:0:1::ff/120 dev lan
down ip addr del 2001:db8:0:1::ff/120 dev lan
down ip link set down dev lan
auto ebgp_r11
iface ebgp_r11 inet manual
up ip link set up dev ebgp_r11
up ip addr add 2001:db8:0:3::ff/120 dev ebgp_r11
down ip addr del 2001:db8:0:3::ff/120 dev ebgp_r11
down ip link set down dev ebgp_r11


@@ -0,0 +1,118 @@
router id 10.0.0.1;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8::1/128;
interface "lan" {
};
interface "ebgp_r10" {
stub;
};
interface "ebgp_r20" {
stub;
};
};
}
protocol static {
import all;
route 2001:db8::/48 blackhole;
}
##############################################################################
# BGP table
#
# Use this routing table to gather external routes received via BGP which we
# want to push to the kernel via our master table and to other routers in our AS
# via iBGP or even to other routers outside our AS again (transit), which can
# be connected here or to a router elsewhere on the border of our AS.
table t_bgp;
protocol pipe p_master_to_bgp {
table master;
peer table t_bgp;
import all; # default
export none; # default
}
##############################################################################
# eBGP R10
#
table t_r10;
protocol static originate_to_r10 {
table t_r10;
import all; # originate here
route 2001:db8::/48 blackhole;
}
protocol bgp ebgp_r10 {
table t_r10;
local 2001:db8:10:4::1 as 65000;
neighbor 2001:db8:10:4::10 as 65010;
import all;
export all;
}
protocol pipe p_bgp_to_r10 {
table t_bgp;
peer table t_r10;
import where proto = "ebgp_r10";
export none;
}
##############################################################################
# eBGP R20
#
table t_r20;
protocol static originate_to_r20 {
table t_r20;
import all; # originate here
route 2001:db8::/48 blackhole;
}
protocol bgp ebgp_r20 {
table t_r20;
local 2001:db8:0:5::1 as 65000;
neighbor 2001:db8:0:5::20 as 65020;
import all;
export all;
}
protocol pipe p_bgp_to_r20 {
table t_bgp;
peer table t_r20;
import where proto = "ebgp_r20";
export none;
}
##############################################################################
# iBGP
#
protocol bgp ibgp_r2 {
table t_bgp;
igp table master;
import none;
export all;
local 2001:db8::1 as 65000;
neighbor 2001:db8::2 as 65000;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,25 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8::1/128 dev lo
down ip addr del 2001:db8::1/128 dev lo
auto lan
iface lan inet manual
up ip link set up dev lan
up ip addr add 2001:db8:0:1::1/120 dev lan
down ip addr del 2001:db8:0:1::1/120 dev lan
down ip link set down dev lan
auto ebgp_r10
iface ebgp_r10 inet manual
up ip link set up dev ebgp_r10
up ip addr add 2001:db8:10:4::1/120 dev ebgp_r10
down ip addr del 2001:db8:10:4::1/120 dev ebgp_r10
down ip link set down dev ebgp_r10
auto ebgp_r20
iface ebgp_r20 inet manual
up ip link set up dev ebgp_r20
up ip addr add 2001:db8:0:5::1/120 dev ebgp_r20
down ip addr del 2001:db8:0:5::1/120 dev ebgp_r20
down ip link set down dev ebgp_r20


@@ -0,0 +1,88 @@
router id 10.0.0.10;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:10::10/128;
interface "lan" {
};
interface "ebgp_r1" {
stub;
};
};
}
protocol static {
import all;
route 2001:db8:10::/48 blackhole;
}
##############################################################################
# BGP table
#
# Use this routing table to gather external routes received via BGP which we
# want to push to the kernel via our master table and to other routers in our AS
# via iBGP or even to other routers outside our AS again (transit), which can
# be connected here or to a router elsewhere on the border of our AS.
table t_bgp;
protocol pipe p_master_to_bgp {
table master;
peer table t_bgp;
import all; # default
export none; # default
}
##############################################################################
# eBGP R1
#
table t_r1;
protocol static originate_to_r1 {
table t_r1;
import all; # originate here
route 2001:db8:10::/48 blackhole;
}
protocol bgp ebgp_r1 {
table t_r1;
local 2001:db8:10:4::10 as 65010;
neighbor 2001:db8:10:4::1 as 65000;
import all;
export all;
}
protocol pipe p_bgp_to_r1 {
table t_bgp;
peer table t_r1;
import where proto = "ebgp_r1";
export none;
}
##############################################################################
# iBGP
#
protocol bgp ibgp_r12 {
table t_bgp;
igp table master;
import none;
export all;
local 2001:db8:10::10 as 65010;
neighbor 2001:db8:10::12 as 65010;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,18 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8:10::10/128 dev lo
down ip addr del 2001:db8:10::10/128 dev lo
auto lan
iface lan inet manual
up ip link set up dev lan
up ip addr add 2001:db8:10:2::10/120 dev lan
down ip addr del 2001:db8:10:2::10/120 dev lan
down ip link set down dev lan
auto ebgp_r1
iface ebgp_r1 inet manual
up ip link set up dev ebgp_r1
up ip addr add 2001:db8:10:4::10/120 dev ebgp_r1
down ip addr del 2001:db8:10:4::10/120 dev ebgp_r1
down ip link set down dev ebgp_r1


@@ -0,0 +1,118 @@
router id 10.0.0.11;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:10::11/128;
interface "lan" {
};
interface "ebgp_r0" {
stub;
};
interface "ebgp_r20" {
stub;
};
};
}
protocol static {
import all;
route 2001:db8:10::/48 blackhole;
}
##############################################################################
# BGP table
#
# Use this routing table to gather external routes received via BGP which we
# want to push to the kernel via our master table and to other routers in our AS
# via iBGP or even to other routers outside our AS again (transit), which can
# be connected here or to a router elsewhere on the border of our AS.
table t_bgp;
protocol pipe p_master_to_bgp {
table master;
peer table t_bgp;
import all; # default
export none; # default
}
##############################################################################
# eBGP R0
#
table t_r0;
protocol static originate_to_r0 {
table t_r0;
import all; # originate here
route 2001:db8:10::/48 blackhole;
}
protocol bgp ebgp_r0 {
table t_r0;
local 2001:db8:0:3::11 as 65010;
neighbor 2001:db8:0:3::ff as 65000;
import all;
export all;
}
protocol pipe p_bgp_to_r0 {
table t_bgp;
peer table t_r0;
import where proto = "ebgp_r0";
export none;
}
##############################################################################
# eBGP R20
#
table t_r20;
protocol static originate_to_r20 {
table t_r20;
import all; # originate here
route 2001:db8:10::/48 blackhole;
}
protocol bgp ebgp_r20 {
table t_r20;
local 2001:db8:10:6::11 as 65010;
neighbor 2001:db8:10:6::20 as 65020;
import all;
export all;
}
protocol pipe p_bgp_to_r20 {
table t_bgp;
peer table t_r20;
import where proto = "ebgp_r20";
export none;
}
##############################################################################
# iBGP
#
protocol bgp ibgp_r12 {
table t_bgp;
igp table master;
import none;
export all;
local 2001:db8:10::11 as 65010;
neighbor 2001:db8:10::12 as 65010;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,25 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8:10::11/128 dev lo
down ip addr del 2001:db8:10::11/128 dev lo
auto lan
iface lan inet manual
up ip link set up dev lan
up ip addr add 2001:db8:10:2::11/120 dev lan
down ip addr del 2001:db8:10:2::11/120 dev lan
down ip link set down dev lan
auto ebgp_r0
iface ebgp_r0 inet manual
up ip link set up dev ebgp_r0
up ip addr add 2001:db8:0:3::11/120 dev ebgp_r0
down ip addr del 2001:db8:0:3::11/120 dev ebgp_r0
down ip link set down dev ebgp_r0
auto ebgp_r20
iface ebgp_r20 inet manual
up ip link set up dev ebgp_r20
up ip addr add 2001:db8:10:6::11/120 dev ebgp_r20
down ip addr del 2001:db8:10:6::11/120 dev ebgp_r20
down ip link set down dev ebgp_r20


@@ -0,0 +1,45 @@
router id 10.0.0.12;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:10::12/128;
interface "lan" {
};
};
}
protocol static {
import all;
route 2001:db8:10::/48 blackhole;
}
##############################################################################
# iBGP
#
protocol bgp ibgp_r10 {
import all;
export none;
local 2001:db8:10::12 as 65010;
neighbor 2001:db8:10::10 as 65010;
}
protocol bgp ibgp_r11 {
import all;
export none;
local 2001:db8:10::12 as 65010;
neighbor 2001:db8:10::11 as 65010;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,11 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8:10::12/128 dev lo
down ip addr del 2001:db8:10::12/128 dev lo
auto lan
iface lan inet manual
up ip link set up dev lan
up ip addr add 2001:db8:10:2::12/120 dev lan
down ip addr del 2001:db8:10:2::12/120 dev lan
down ip link set down dev lan


@@ -0,0 +1,45 @@
router id 10.0.0.2;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8::2/128;
interface "lan" {
};
};
}
protocol static {
import all;
route 2001:db8::/48 blackhole;
}
##############################################################################
# iBGP
#
protocol bgp ibgp_r0 {
import all;
export none;
local 2001:db8::2 as 65000;
neighbor 2001:db8::ff as 65000;
}
protocol bgp ibgp_r1 {
import all;
export none;
local 2001:db8::2 as 65000;
neighbor 2001:db8::1 as 65000;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,11 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8::2/128 dev lo
down ip addr del 2001:db8::2/128 dev lo
auto lan
iface lan inet manual
up ip link set up dev lan
up ip addr add 2001:db8:0:1::2/120 dev lan
down ip addr del 2001:db8:0:1::2/120 dev lan
down ip link set down dev lan


@@ -0,0 +1,103 @@
router id 10.0.0.20;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:20::20/128;
interface "ebgp_r1" {
stub;
};
interface "ebgp_r11" {
stub;
};
};
}
protocol static {
import all;
route 2001:db8:20::/48 blackhole;
}
##############################################################################
# BGP table
#
# Use this routing table to gather external routes received via BGP which we
# want to push to the kernel via our master table and to other routers in our AS
# via iBGP or even to other routers outside our AS again (transit), which can
# be connected here or to a router elsewhere on the border of our AS.
table t_bgp;
protocol pipe p_master_to_bgp {
table master;
peer table t_bgp;
import all; # default
export none; # default
}
##############################################################################
# eBGP R1
#
table t_r1;
protocol static originate_to_r1 {
table t_r1;
import all; # originate here
route 2001:db8:20::/48 blackhole;
}
protocol bgp ebgp_r1 {
table t_r1;
local 2001:db8:0:5::20 as 65020;
neighbor 2001:db8:0:5::1 as 65000;
import all;
export all;
}
protocol pipe p_bgp_to_r1 {
table t_bgp;
peer table t_r1;
import where proto = "ebgp_r1";
export none;
}
##############################################################################
# eBGP R11
#
table t_r11;
protocol static originate_to_r11 {
table t_r11;
import all; # originate here
route 2001:db8:20::/48 blackhole;
}
protocol bgp ebgp_r11 {
table t_r11;
local 2001:db8:10:6::20 as 65020;
neighbor 2001:db8:10:6::11 as 65010;
import all;
export all;
}
protocol pipe p_bgp_to_r11 {
table t_bgp;
peer table t_r11;
import where proto = "ebgp_r11";
export none;
}


@@ -0,0 +1,32 @@
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
2001:db8::ff lo.r0 r0
2001:db8:0:1::ff lan.r0
2001:db8:0:3::ff ebgp_r11.r0
2001:db8:10::10 lo.r10 r10
2001:db8:10:2::10 lan.r10
2001:db8:10:4::10 ebgp_r1.r10
2001:db8:10::11 lo.r11 r11
2001:db8:10:2::11 lan.r11
2001:db8:0:3::11 ebgp_r0.r11
2001:db8:10:6::11 ebgp_r20.r11
2001:db8:10::12 lo.r12 r12
2001:db8:10:2::12 lan.r12
2001:db8::1 lo.r1 r1
2001:db8:0:1::1 lan.r1
2001:db8:10:4::1 ebgp_r10.r1
2001:db8:0:5::1 ebgp_r20.r1
2001:db8:20::20 lo.r20 r20
2001:db8:0:5::20 ebgp_r1.r20
2001:db8:10:6::20 ebgp_r11.r20
2001:db8::2 lo.r2 r2
2001:db8:0:1::2 lan.r2


@@ -0,0 +1,18 @@
auto lo
iface lo inet loopback
up ip addr add 2001:db8:20::20/128 dev lo
down ip addr del 2001:db8:20::20/128 dev lo
auto ebgp_r1
iface ebgp_r1 inet manual
up ip link set up dev ebgp_r1
up ip addr add 2001:db8:0:5::20/120 dev ebgp_r1
down ip addr del 2001:db8:0:5::20/120 dev ebgp_r1
down ip link set down dev ebgp_r1
auto ebgp_r11
iface ebgp_r11 inet manual
up ip link set up dev ebgp_r11
up ip addr add 2001:db8:10:6::20/120 dev ebgp_r11
down ip addr del 2001:db8:10:6::20/120 dev ebgp_r11
down ip link set down dev ebgp_r11

bgp-contd/lxc/fixnetwork.sh (new file, 123 lines)

@@ -0,0 +1,123 @@
#!/bin/sh
sed -i '/lxc.network/d' R*/config
cat <<EOF >> R0/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = lan
lxc.network.veth.pair = r0.1
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r11
lxc.network.veth.pair = r0.3
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF
cat <<EOF >> R1/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = lan
lxc.network.veth.pair = r1.1
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r10
lxc.network.veth.pair = r1.4
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r20
lxc.network.veth.pair = r1.5
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF
cat <<EOF >> R2/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = lan
lxc.network.veth.pair = r2.1
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF
cat <<EOF >> R10/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = lan
lxc.network.veth.pair = r10.2
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r1
lxc.network.veth.pair = r10.4
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF
cat <<EOF >> R11/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = lan
lxc.network.veth.pair = r11.2
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r0
lxc.network.veth.pair = r11.3
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r20
lxc.network.veth.pair = r11.6
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF
cat <<EOF >> R12/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = lan
lxc.network.veth.pair = r12.2
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF
cat <<EOF >> R20/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r1
lxc.network.veth.pair = r20.5
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = ebgp_r11
lxc.network.veth.pair = r20.6
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
EOF

bgp-intro/README.md (new file, 531 lines)

@@ -0,0 +1,531 @@
BGP, Part I
===========
In the previous tutorial, we discovered how to let [OSPF](/ospf-intro/README.md) dynamically configure routing inside a network. This tutorial provides an introduction to another routing protocol: BGP, the Border Gateway Protocol. As the name implies, this protocol acts on the border of a network. Where OSPF is well suited to keep track of all the tiny details of what's happening in our internal network, BGP will be talking to the outside world to interconnect our network with other networks, managed by someone else.
## BGP Essentials
When routers talk BGP to each other, they essentially just claim that network ranges are reachable via them:
![BGP network, barebones](/bgp-intro/bgp-heythere.png)
Let's look at the same picture again, hiding less information:
![BGP network, less simplified](/bgp-intro/bgp-hey2.png)
The picture shows two networks, which are interconnected through routers `R3` and `R10`.
* A complete network under control of somebody has an AS ([Autonomous System](https://tools.ietf.org/html/rfc1930)) number. This number will be used later in the BIRD BGP configuration.
* The routes that are published to another network are aggregated as much as possible, to keep their number to a minimum. While the internal routing table in, for example, `AS64080` might contain dozens of prefixes, one for each little vlan, and probably a number of single host routes (IPv4 `/32` and IPv6 `/128`), they're advertised to the outside as just three routes in total.
* If neighbouring routers in different networks are directly connected, they often interconnect using a minimally sized network range. For IPv4, this is usually a `/30`, and for IPv6 a `/120` or a `/126` prefix, containing only the two routers. In the example above, the small network ranges are taken from the network of `AS64080`.
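The prefix sizes mentioned above are easy to check with Python's `ipaddress` module. This is just a sketch: `10.40.217.16/30` and `2001:db8:40:d910::/120` are the link ranges between `R3` and `R10` that show up later in this tutorial, `10.40.216.0/21` is one of the aggregates that `R3` will announce, and the `/126` variant and the helper names are made up for illustration.

```python
import ipaddress


def address_count(prefix):
    """Return the total number of addresses inside a prefix."""
    return ipaddress.ip_network(prefix).num_addresses


def is_covered(link, aggregate):
    """Return True if the link range falls inside the aggregate."""
    return ipaddress.ip_network(link).subnet_of(ipaddress.ip_network(aggregate))


# An IPv4 /30 holds 4 addresses: network, two routers and broadcast.
print(address_count("10.40.217.16/30"))
# An IPv6 /126 is just as small, a /120 leaves room for 256 addresses.
print(address_count("2001:db8:40:d910::/126"))
print(address_count("2001:db8:40:d910::/120"))
# The interconnect range is taken from address space of AS64080.
print(is_covered("10.40.217.16/30", "10.40.216.0/21"))
```

Note that `subnet_of` needs Python 3.7 or newer.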
## OSPF vs. BGP
While the title of this section might seem logical, since we're considering BGP after just having spent quite some time on OSPF, it's actually a non-issue. OSPF and BGP are two very different routing protocols, which are used to get different things done. Nonetheless, let's look at some differences:
OSPF:
* Routes in the network are originated by just putting IP addresses on a network interface of a router, and letting the routing protocol pick them up automatically.
* The routes in OSPF are addresses and subnets that are actually in use.
* Every router that participates in the OSPF protocol has a full, detailed view of the network, built from link state updates that are flooded through the network. This knowledge is used to calculate the shortest path to every part of the network.
BGP:
* Routes that are published to other networks are "umbrella ranges", which are as big as possible and are defined manually.
* There is no actual proof that the addresses which are advertised are actually in use inside the network.
* A neighbouring BGP router knows that some prefix is reachable via another network, but where the OSPF shortest path calculation deals with knowledge about all separate routers, paths and weights, BGP looks at a higher level, treating a complete network (AS) as one step. By default, BGP also tries to forward traffic in the direction that has the smallest number of AS hops to a destination (the shortest AS path), but BGP provides a fair amount of configurable options to influence the routing decisions.
So, OSPF is an IGP (Interior Gateway Protocol) and BGP is an EGP (Exterior Gateway Protocol). BGP can connect OSPF networks to each other, hiding a lot of detail inside them.
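The "shortest AS path" idea can be sketched in a few lines of Python. Treat this as a simplified illustration under some assumptions: the `shortest_as_path` helper, the neighbour names and the AS numbers in the example data are made up, and a real BGP implementation evaluates several other rules before and after this one.

```python
def shortest_as_path(routes):
    """Return the route that crosses the smallest number of ASes.

    On a tie, min() simply returns the first candidate; real BGP would
    continue with further tie-breaking rules instead.
    """
    return min(routes, key=lambda route: len(route["as_path"]))


# Two hypothetical announcements for the same prefix.
candidates = [
    {"next_hop": "neighbour A", "as_path": [65010, 65020]},
    {"next_hop": "neighbour B", "as_path": [65020]},
]

best = shortest_as_path(candidates)
print(best["next_hop"], len(best["as_path"]))
```

Notice that nothing in here knows how many routers or links are hidden inside each AS. That's exactly the level of detail BGP leaves to an IGP like OSPF.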
## BGP and OSPF with BIRD: Setting up the containers and networks
In the second half of this tutorial we'll configure a network, using OSPF, BGP and the BIRD routing software. BGP-wise, it's kept simple, using just a single connection between two networks.
![BGP and OSPF network](/bgp-intro/bgp-ospf.png)
Our networks start to look serious now! It might be handy to print this image so you don't have to scroll back up all the time, comparing all the routes in the output of commands with the network topology.
Thankfully, most of the configuration is provided already, so we can quickly set up this whole network using our LXC environment. Just like in the previous tutorial, the birdbase container can be cloned, after which the lxc network information and configuration inside the containers can be copied into them.
1. Clone this git repository somewhere to be able to use some files from the bgp-intro/lxc/ directory inside.
2. lxc-clone the birdbase container several times:

        lxc-clone -s birdbase R0
        lxc-clone -s birdbase R1
        lxc-clone -s birdbase R3
        lxc-clone -s birdbase R10
        lxc-clone -s birdbase R11
        lxc-clone -s birdbase R12
        lxc-clone -s birdbase H6
        lxc-clone -s birdbase H7
        lxc-clone -s birdbase H19
        lxc-clone -s birdbase H34

3. Set up the network interfaces in the lxc configuration. This can be done by removing all network related configuration that remains from the cloned birdbase container, and then appending all needed interface configuration by running the fixnetwork.sh script that can be found in `bgp-intro/lxc/` in this git repository. Of course, have a look at the contents of the script first, before executing it.

        . ./fixnetwork.sh

4. Copy extra configuration into the containers. The bgp-intro/lxc/ directory inside this git repository contains a little file hierarchy that can just be copied over the configuration of the containers. For each router, it's a network/interfaces configuration file which adds an IP address that corresponds with the Router ID to the loopback interface, and a simple BIRD configuration file that serves as a starting point for our next steps.
5. Start all containers:

        for router in 0 1 3 10 11 12; do lxc-start -d -n R$router; sleep 2; done
        for host in 6 7 19 34; do lxc-start -d -n H$host; sleep 2; done

6. Verify connectivity and look around a bit. Here's an example for R1:
lxc-attach -n R1
root@R1:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet 10.40.217.1/32 scope global lo
valid_lft forever preferred_lft forever
inet6 2001:db8:40::1/128 scope global
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
109: vlan216: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:00:0a:28:d8:03 brd ff:ff:ff:ff:ff:ff
inet 10.40.216.3/28 brd 10.40.216.15 scope global vlan216
valid_lft forever preferred_lft forever
inet6 2001:db8:40:d8::3/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::aff:fe28:d803/64 scope link
valid_lft forever preferred_lft forever
111: vlan3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:00:0a:28:03:01 brd ff:ff:ff:ff:ff:ff
inet 10.40.3.1/24 brd 10.40.3.255 scope global vlan3
valid_lft forever preferred_lft forever
inet6 2001:db8:40:3::1/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::aff:fe28:301/64 scope link
valid_lft forever preferred_lft forever
root@R1:/# ip r
10.40.2.0/24 via 10.40.216.2 dev vlan216 proto bird
10.40.3.0/24 dev vlan3 proto kernel scope link src 10.40.3.1
10.40.216.0/28 dev vlan216 proto kernel scope link src 10.40.216.3
10.40.217.0 via 10.40.216.2 dev vlan216 proto bird
10.40.217.3 via 10.40.216.1 dev vlan216 proto bird
10.40.217.16/30 via 10.40.216.1 dev vlan216 proto bird
root@R1:/# birdc show route
BIRD 1.4.5 ready.
10.40.217.16/30 via 10.40.216.1 on vlan216 [ospf1 22:58:02] * I (150/20) [10.40.217.3]
10.40.216.0/28 dev vlan216 [ospf1 22:58:02] * I (150/10) [10.40.217.3]
10.40.217.0/32 via 10.40.216.2 on vlan216 [ospf1 22:58:02] * I (150/10) [10.40.217.0]
10.40.217.1/32 dev lo [ospf1 22:57:42] * I (150/0) [10.40.217.1]
10.40.217.3/32 via 10.40.216.1 on vlan216 [ospf1 22:58:02] * I (150/10) [10.40.217.3]
10.40.2.0/24 via 10.40.216.2 on vlan216 [ospf1 22:58:02] * I (150/20) [10.40.217.0]
10.40.3.0/24 dev vlan3 [ospf1 22:57:42] * I (150/10) [10.40.217.1]
root@R1:/# ip -6 r
2001:db8:40:: via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
unreachable 2001:db8:40::1 dev lo proto kernel metric 256 error -101
2001:db8:40::3 via fe80::aff:fe28:d801 dev vlan216 proto bird metric 1024
2001:db8:40:2::/120 via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
2001:db8:40:3::/120 dev vlan3 proto kernel metric 256
2001:db8:40:d8::/120 dev vlan216 proto kernel metric 256
2001:db8:40:d910::/120 via fe80::aff:fe28:d801 dev vlan216 proto bird metric 1024
fe80::/64 dev vlan216 proto kernel metric 256
fe80::/64 dev vlan3 proto kernel metric 256
root@R1:/# birdc6 show route
BIRD 1.4.5 ready.
2001:db8:40:d8::/120 dev vlan216 [ospf1 22:58:08] * I (150/10) [10.40.217.3]
2001:db8:40::/128 via fe80::aff:fe28:d802 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.0]
2001:db8:40:2::/120 via fe80::aff:fe28:d802 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.0]
2001:db8:40:3::/120 dev vlan3 [ospf1 22:57:41] * I (150/10) [10.40.217.1]
2001:db8:40::3/128 via fe80::aff:fe28:d801 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.3]
2001:db8:40:d910::/120 via fe80::aff:fe28:d801 on vlan216 [ospf1 22:58:08] * I (150/20) [10.40.217.3]
As you can see, OSPF is running for IPv4 and IPv6, and has discovered the complete internal network of `AS64080`.
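If you'd rather compare these route listings with a script than by eye, a small parser goes a long way. The following Python sketch is an assumption-laden helper, not something that ships with BIRD: the regular expression and the `parse_route` function are made up here, and they've only been matched against the `birdc show route` lines shown in this tutorial, where the two numbers between parentheses are the preference and the OSPF metric.

```python
import re

# Matches lines like:
#   10.40.2.0/24 via 10.40.216.2 on vlan216 [ospf1 22:58:02] * I (150/20) [10.40.217.0]
#   10.40.3.0/24 dev vlan3 [ospf1 22:57:42] * I (150/10) [10.40.217.1]
ROUTE_RE = re.compile(
    r"^(?P<prefix>\S+)\s+"
    r"(?:via (?P<gateway>\S+) on|dev) (?P<iface>\S+)\s+"
    r"\[(?P<proto>\S+) [^\]]*\]"
    r".*\((?P<preference>\d+)/(?P<metric>\d+)\)"
)


def parse_route(line):
    """Return a dict describing one route line, or None if it doesn't match."""
    match = ROUTE_RE.match(line.strip())
    if match is None:
        return None
    route = match.groupdict()
    route["preference"] = int(route["preference"])
    route["metric"] = int(route["metric"])
    return route


sample = """\
BIRD 1.4.5 ready.
10.40.217.16/30 via 10.40.216.1 on vlan216 [ospf1 22:58:02] * I (150/20) [10.40.217.3]
10.40.3.0/24 dev vlan3 [ospf1 22:57:42] * I (150/10) [10.40.217.1]
"""

for line in sample.splitlines():
    route = parse_route(line)
    if route:
        # Directly connected routes have no gateway.
        print(route["prefix"], route["gateway"] or "directly connected", route["metric"])
```

Lines that aren't routes, like the `BIRD 1.4.5 ready.` banner, are skipped because `parse_route` returns `None` for them.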
Now make sure you can do the following, and answer the following questions:
* From H6, `traceroute -n` and `traceroute6 -n` to a few destinations in `AS64080` to get acquainted with the network topology.
* Look at the BIRD logging. A fun way to follow the logging is to do `tail -F R*/rootfs/var/log/bird/*.log` from outside the containers, and then start all of them.
* Find out why `10.40.217.18` or `2001:db8:40:d910::2` on `R10` cannot be pinged from `R1`, while the routes to `10.40.217.16/30` and `2001:db8:40:d910::/120` are actually present in the routing tables of `R1` and `R3`.
## BIRD BGP configuration
Let's zoom in a bit first, and focus on the connection between `R3` and `R10`. This section will show how to configure the actual BGP connection between those two routers, so they will learn about each other's networks.
![BGP and OSPF network, zoom in on R3, R10](/bgp-intro/bgp-ospf-zoom.png)
The routing table of `R3` contains information about the internal network of its own AS, `AS64080`. As you can see, routes to the ranges in `AS65033` are missing.
root@R3:/# ip r
10.40.2.0/24 via 10.40.216.2 dev vlan216 proto bird
10.40.3.0/24 via 10.40.216.3 dev vlan216 proto bird
10.40.216.0/28 dev vlan216 proto kernel scope link src 10.40.216.1
10.40.217.0 via 10.40.216.2 dev vlan216 proto bird
10.40.217.1 via 10.40.216.3 dev vlan216 proto bird
10.40.217.16/30 dev vlan217 proto kernel scope link src 10.40.217.17
root@R3:/# ip -6 r
2001:db8:40:: via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
2001:db8:40::1 via fe80::aff:fe28:d803 dev vlan216 proto bird metric 1024
unreachable 2001:db8:40::3 dev lo proto kernel metric 256 error -101
2001:db8:40:2::/120 via fe80::aff:fe28:d802 dev vlan216 proto bird metric 1024
2001:db8:40:3::/120 via fe80::aff:fe28:d803 dev vlan216 proto bird metric 1024
2001:db8:40:d8::/120 dev vlan216 proto kernel metric 256
2001:db8:40:d910::/120 dev vlan217 proto kernel metric 256
fe80::/64 dev vlan216 proto kernel metric 256
fe80::/64 dev vlan217 proto kernel metric 256
Now, add the following configuration to `bird.conf` of `R3`:

    ##############################################################################
    # eBGP R10
    #

    table t_r10;

    protocol static originate_to_r10 {
        table t_r10;
        import all;  # originate here
        route 10.40.0.0/22 blackhole;
        route 10.40.216.0/21 blackhole;
    }

    protocol bgp ebgp_r10 {
        table t_r10;
        local 10.40.217.17 as 64080;
        neighbor 10.40.217.18 as 65033;
        import filter {
            if net ~ [ 10.0.0.0/8{19,24} ] then accept;
            reject;
        };
        import keep filtered on;
        export where source = RTS_STATIC;
    }

    protocol pipe p_master_to_r10 {
        table master;
        peer table t_r10;
        import where source = RTS_BGP;
        export none;
    }

### A closer look
Let me explain a bit about what's going on here. So far, we've used the BIRD protocol types `kernel`, `device` and `ospf`. This configuration snippet introduces three other ones: `static`, `bgp` and `pipe`. Besides that, there's also a table definition on top.
table t_r10;
By issuing `table t_r10`, we tell BIRD that we'd like to use an extra internal routing table with the name `t_r10`. By default, BIRD always has a routing table named `master`, and now we added a second one. Routing tables in BIRD are just a collection of routes, having some attributes.

    protocol static originate_to_r10 {
        table t_r10;
        import all;  # originate here
        route 10.40.0.0/22 blackhole;
        route 10.40.216.0/21 blackhole;
    }

The static protocol is used to generate a collection of static routes. In this case, we define a static protocol named `originate_to_r10` and connect it to table `t_r10`. The import statement causes the routes that are generated by this static protocol to be imported into the `t_r10` table. Static routes usually point at a neighbor router, using a `via` statement, but in this case we don't care about a next hop, since it's just a collection of prefixes that will be exported via BGP. The blackhole target won't actually be used for anything here.
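
If you'd like to convince yourself that these two ranges are enough to describe all of `AS64080`, here's a small Python sketch. It is not part of the BIRD configuration; it only uses the standard `ipaddress` module to check which aggregate covers each internal IPv4 prefix. The list of prefixes is copied from the `ip r` output of `R3` above, and the helper name `covering_aggregate` is made up for this example.

```python
import ipaddress

# Aggregates originated towards R10 by the static protocol.
AGGREGATES = [ipaddress.ip_network(p) for p in ("10.40.0.0/22", "10.40.216.0/21")]

# Internal prefixes of AS64080, copied from the `ip r` output of R3.
INTERNAL = ["10.40.2.0/24", "10.40.3.0/24", "10.40.216.0/28",
            "10.40.217.0/32", "10.40.217.1/32", "10.40.217.16/30"]

def covering_aggregate(prefix):
    """Return the announced aggregate that contains `prefix`, or None."""
    net = ipaddress.ip_network(prefix)
    for aggregate in AGGREGATES:
        if net.subnet_of(aggregate):
            return aggregate
    return None

for prefix in INTERNAL:
    print(f"{prefix:16} is covered by {covering_aggregate(prefix)}")
```

Every internal prefix falls inside one of the two ranges, so announcing just the two aggregates is sufficient and the internal details stay inside our own network.
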

    protocol bgp ebgp_r10 {
        table t_r10;
        local 10.40.217.17 as 64080;
        neighbor 10.40.217.18 as 65033;
        import filter {
            if net ~ [ 10.0.0.0/8{19,24} ] then accept;
            reject;
        };
        import keep filtered on;
        export where source = RTS_STATIC;
    }

The bgp protocol is named after the router which it's talking to, `R10`, and is also connected to the `t_r10` routing table inside BIRD. It has a local and remote IP address and AS number. The import rules are a bit more complex than a simple `import all`, which also would have been sufficient here to get it working. The filter shown here just makes sure only RFC1918 prefixes from `10/8` are accepted, which are allowed to be from a `/19` to `/24` in size each. The export rule contains a simple filter that tells BIRD to push all routes from table `t_r10` that originate from a static protocol to the outside, to `R10`.
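
To get a feeling for what the pattern `10.0.0.0/8{19,24}` matches, here's a rough Python emulation of the import filter. It assumes the pattern means "any prefix inside `10.0.0.0/8` with a length from 19 up to and including 24 bits", which is how the paragraph above describes it. The function names are made up for this sketch; BIRD does all of this internally.

```python
import ipaddress

def matches_pattern(net, base, low, high):
    """Emulate a BIRD prefix pattern such as 10.0.0.0/8{19,24}."""
    net = ipaddress.ip_network(net)
    base = ipaddress.ip_network(base)
    return net.subnet_of(base) and low <= net.prefixlen <= high

def import_filter(net):
    """Mimic the import filter of ebgp_r10: accept or reject a prefix."""
    if matches_pattern(net, "10.0.0.0/8", 19, 24):
        return "accept"
    return "reject"

for candidate in ("10.40.32.0/19", "10.40.216.0/21", "10.40.32.10/32",
                  "10.0.0.0/8", "192.168.1.0/24"):
    print(f"{candidate:16} -> {import_filter(candidate)}")
```

The `/19` and `/21` pass, while a loopback `/32`, the whole `10/8`, or anything outside `10/8` is rejected.
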

    protocol pipe p_master_to_r10 {
        table master;
        peer table t_r10;
        import where source = RTS_BGP;
        export none;
    }

The pipe protocol is a simple protocol that is able to move around routes between internal BIRD routing tables. In this case, the pipe protocol `p_master_to_r10` is connected to the central `master` routing table and is looking at table `t_r10`. From table `t_r10`, all routes that originate from an external BGP peer are imported into the master table. Doing so will cause the routes that will be learned from the remote network to end up in the routing table of the Linux kernel (via the kernel protocol that exports them from the BIRD master table outside BIRD), while the routes that only were meant to be used to export to the BGP peer (generated by the static protocol) stay in `t_r10`.
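
The following Python sketch mimics what this pipe does, under the simplifying assumption that a BIRD table is just a list of routes with a `source` attribute. The `run_pipe` function and the dictionaries are hypothetical; only the `RTS_*` names come from BIRD. The contents of `t_r10` are what I'd expect `R3` to have in that table once the session with `R10` is up.

```python
RTS_STATIC = "RTS_STATIC"
RTS_BGP = "RTS_BGP"
RTS_OSPF = "RTS_OSPF"

# Table t_r10 on R3: two static aggregates plus one route learned from R10.
t_r10 = [
    {"net": "10.40.0.0/22", "source": RTS_STATIC},
    {"net": "10.40.216.0/21", "source": RTS_STATIC},
    {"net": "10.40.32.0/19", "source": RTS_BGP},
]

# The master table already holds routes learned via OSPF.
master = [{"net": "10.40.2.0/24", "source": RTS_OSPF}]

def run_pipe(table, peer_table, import_ok, export_ok):
    """Return (table, peer_table) after the pipe has moved routes both ways."""
    imported = [route for route in peer_table if import_ok(route)]
    exported = [route for route in table if export_ok(route)]
    return table + imported, peer_table + exported

master, t_r10 = run_pipe(
    master, t_r10,
    import_ok=lambda route: route["source"] == RTS_BGP,  # import where source = RTS_BGP
    export_ok=lambda route: False,                       # export none
)

print("master:", [route["net"] for route in master])
print("t_r10: ", [route["net"] for route in t_r10])
```

Only the BGP route ends up in `master`, and nothing from `master` leaks into `t_r10`.
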
Don't worry if the whole construction with tables, protocols and pipes is still a bit confusing. The first goal is to see the BGP routing in action, and afterwards I'll explain more about those BIRD internals.
Also, remember that the internal BIRD routing tables are not used to actually do packet forwarding. During the OSPF tutorial, we already discussed this difference between the "Control Plane" and "Forwarding Plane". Actually, the routing table inside the control plane is usually called the "RIB" (Routing Information Base), while the routing table that is used in the forwarding plane is called the "FIB" (Forwarding Information Base). Just look up all those terms on the internet to see what everyone is saying about them.
### Seeing it in action!
After adding the configuration on `R3`, fire up the interactive BIRD console, using `birdc`:
root@R3:/# birdc
BIRD 1.4.5 ready.
bird>
Don't forget to tell BIRD to read and apply the changed configuration:
bird> con
Reading configuration from /etc/bird/bird.conf
Reconfigured
Now, the three new protocols should be shown:

    bird> show protocols
    name             proto    table    state  since       info
    kernel1          Kernel   master   up     2015-06-14
    device1          Device   master   up     2015-06-14
    ospf1            OSPF     master   up     2015-06-14  Running
    originate_to_r10 Static   t_r10    up     23:54:16
    p_master_to_r10  Pipe     master   up     23:54:16    => t_r10
    ebgp_r10         BGP      t_r10    start  00:34:16    Active Socket: Connection refused
    bird> show route table t_r10
    10.40.216.0/21 blackhole [originate_to_r10 23:54:16] * (200)
    10.40.0.0/22 blackhole [originate_to_r10 23:54:16] * (200)

Well, the routes are waiting to be pushed to `R10` in the `t_r10` table, and no routes from `AS65033` are visible yet. There's only an ugly "Connection refused"... reminding you that the other end of the BGP connection needs to be configured. Now it's up to you to configure `R10` with the opposite part of the configuration, and make it talk to `R3`!
When successful, the same commands, now run on `R10`, should show the BGP session to `R3` as Established:

    bird> show protocols
    name             proto    table    state  since       info
    kernel1          Kernel   master   up     2015-06-14
    device1          Device   master   up     2015-06-14
    ospf1            OSPF     master   up     2015-06-14  Running
    originate_to_r3  Static   t_r3     up     00:48:27
    ebgp_r3          BGP      t_r3     up     00:48:32    Established
    p_master_to_r3   Pipe     master   up     00:48:27    => t_r3

Table `t_r3` now also contains the routes that are learned from `AS64080`:
bird> show route table t_r3
10.40.216.0/21 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.32.0/19 blackhole [originate_to_r3 00:48:27] * (200)
10.40.0.0/22 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
The above shows for example that prefix `10.40.216.0/21` was learned via the protocol `ebgp_r3` at 00:48:32, and that the range is originating from `AS64080`. The `via 10.40.217.17` is the BGP next-hop, which is the first router _outside_ our own network.
The BIRD master routing table also contains the routes learned over BGP, thanks to the `p_master_to_r3` protocol:
bird> show route
10.40.217.16/30 dev vlan217 [ospf1 2015-06-14] * I (150/10) [10.40.32.10]
10.40.216.0/21 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.33.0/26 dev vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
10.40.36.0/24 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
10.40.48.0/21 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.11]
10.40.32.10/32 dev lo [ospf1 2015-06-14] * I (150/0) [10.40.32.10]
10.40.32.11/32 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.11]
10.40.0.0/22 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.32.12/32 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
The last step to get the routes into the actual forwarding table inside the Linux kernel is done by the kernel protocol. Since there is no explicit name given for the kernel protocol in the configuration, BIRD just names it `kernel1`.
bird> show route export kernel1
10.40.216.0/21 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.36.0/24 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
10.40.48.0/21 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.11]
10.40.32.11/32 via 10.40.33.2 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.11]
10.40.0.0/22 via 10.40.217.17 on vlan217 [ebgp_r3 00:48:32] * (100) [AS64080i]
10.40.32.12/32 via 10.40.33.3 on vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
Now the routes show up in the output of `ip route`, labeled with proto bird:
root@R10:/# ip r
10.40.0.0/22 via 10.40.217.17 dev vlan217 proto bird
10.40.32.11 via 10.40.33.2 dev vlan33 proto bird
10.40.32.12 via 10.40.33.3 dev vlan33 proto bird
10.40.33.0/26 dev vlan33 proto kernel scope link src 10.40.33.1
10.40.36.0/24 via 10.40.33.3 dev vlan33 proto bird
10.40.48.0/21 via 10.40.33.2 dev vlan33 proto bird
10.40.216.0/21 via 10.40.217.17 dev vlan217 proto bird
10.40.217.16/30 dev vlan217 proto kernel scope link src 10.40.217.18
Well, let's have a look at what we can do with this result. Since both networks are now aware of each other's routes, I'd expect I can do some tracerouting into a remote network now!
root@R10:/# traceroute -n 10.40.2.6
traceroute to 10.40.2.6 (10.40.2.6), 30 hops max, 60 byte packets
1 10.40.217.17 0.356 ms 0.319 ms 0.324 ms
2 10.40.216.2 0.430 ms 0.427 ms 0.378 ms
3 10.40.2.6 0.781 ms 0.724 ms 0.716 ms
`R10` now knows the route to IPv4 ranges used in `AS64080`, and it seems `H6` also knows a route back to `R10`.
Let's try it from `H34`!
root@H34:/# traceroute -n 10.40.2.6
traceroute to 10.40.2.6 (10.40.2.6), 30 hops max, 60 byte packets
1 10.40.36.1 0.296 ms !N 0.091 ms !N *
Meh, that doesn't look too good. Apparently there's more work to do.
### Some assignments
Now make sure you can do the following, and answer the following questions:
* Configure the IPv6 BGP connection between `R3` and `R10`. IPv4 and IPv6 are handled separately by BIRD for now, but the configuration for IPv6 is very similar to the configuration I showed here. Just use `import all` for bgp if you don't want to learn more about filtering now.
* Explain why `10.40.217.18` or `2001:db8:40:d910::2` on `R10` can be pinged from `R1` now, while this was not the case before:
root@R1:/# ping6 2001:db8:40:d910::2
PING 2001:db8:40:d910::2(2001:db8:40:d910::2) 56 data bytes
64 bytes from 2001:db8:40:d910::2: icmp_seq=1 ttl=63 time=0.399 ms
64 bytes from 2001:db8:40:d910::2: icmp_seq=2 ttl=63 time=0.099 ms
^C
--- 2001:db8:40:d910::2 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.099/0.249/0.399/0.150 ms
* Try to export a route outside of `10.0.0.0/8` over BGP, from `R3` to `R10` and notice that the filter will actually stop that route from being propagated, while accepting the other routes. Using the `show route filtered protocol ebgp_r3` command the route should be visible, thanks to the `import keep filtered on` option that is set.
* Figure out why, despite the fact that the two networks learned each other's prefixes, you still cannot reach any router or host in the neighbor network that lies behind the border router. Try the following ICMP echo commands and explain why they do or don't succeed. Hint: use `tcpdump -ni vlanXYZ` on the right vlan interface to see the actual traffic, with source and destination addresses.
- `R3` -> `R10`: `root@R3:/# ping 10.40.32.10`
- `R3` -> `R11`: `root@R3:/# ping 10.40.32.11`
- `R11` -> `R3`: `root@R11:/# ping 10.40.217.3`
- `R12` -> `R1`: `root@R12:/# ping 10.40.217.1`
After explaining a bit more about the BIRD tables and protocols, we'll fix all these reachability issues.
## Intermezzo: BIRD tables, protocols, import, export
The usage of import, export, different protocols and routing tables can be a bit confusing at first. Well, at least [it was very frustrating for me](http://bird.network.cz/pipermail/bird-users/2013-January/008071.html), until [I found out](http://bird.network.cz/pipermail/bird-users/2013-January/008081.html) how to use it.
The main gotcha here is that the import and export statements are to be considered from the point of view of the BIRD routing table that is connected to the protocol (either by specifying the table option, or omitting it, using the default `master` table).
What I found out is that the easiest way to prevent confusion is to take the BIRD 'master' table as central point of reasoning, and then configure everything so that 'import' points closer to the master table, importing routes closer to the heart of BIRD, and 'export' points away from it, pushing routes to the outside world.
Here's a diagram of the BIRD configuration that we just used:
![BIRD protocols, tables, import and export](/bgp-intro/bird-prototable.png)
And here's how you should read the configuration that is in your routers right now:
* table `master` is the central routing table of BIRD
* kernel protocol `kernel1` exports routes from BIRD to Linux
* ospf protocol `ospf1` imports routes from other OSPF routers in the network into BIRD
* pipe protocol `p_master_to_r10` imports routes from its peer table `t_r10` into table `master`
* table `t_r10` is another BIRD table that contains a collection of routes with attributes
* static protocol `originate_to_r10` imports static routes into table `t_r10`
* bgp protocol `ebgp_r10` exports routes from table `t_r10` to `R10`
Note that the OSPF protocol itself also generates routes for connected subnets that are stub or non-stub networks. These routes are not imported via the kernel protocol.
The output of `show protocols` should also totally make sense now (table column width adjusted):

    root@R3:/# birdc show protocols
    BIRD 1.4.5 ready.
    name             proto    table    state  since       info
    kernel1          Kernel   master   up     2015-06-14
    device1          Device   master   up     2015-06-14
    ospf1            OSPF     master   up     2015-06-14  Running
    originate_to_r10 Static   t_r10    up     2015-06-18
    p_master_to_r10  Pipe     master   up     2015-06-18  => t_r10
    ebgp_r10         BGP      t_r10    up     2015-06-19  Established

Assignments:
* The OSPF protocol configuration that we are using does not contain any table, import or export. This means it's using the defaults, which are table master, import all, export none. Add a line specifying `import none;` to the OSPF protocol configuration, and look at the effect on the BIRD master table, and the Linux routing table.
* Change the BIRD configuration to use only the `master` table, eliminating the extra `t_r10` routing table, without changing the set of routes that are actually exported to the Linux kernel. Doing so should show that it's entirely possible, but that decreasing complexity by removing the extra table will increase complexity in the filters needed.
## Connecting the internal network
There's a last task that needs to be completed before every host and router in the two networks can see each other. As you just found out, only the border routers that actually speak BGP have learned the routes to the other network, and the internal routers still have no idea about them.
So, how should `R0` and `R1` be told about the routes from `AS65033` that are already known to `R3`?
### iBGP
BGP is not only meant to be used to connect to a router in an external network, it can also be used to connect back to routers in our own AS, to provide them with the learned information about externally reachable networks. A connection to a router in a different AS is called an eBGP connection, and a connection to a router inside the same AS is called an iBGP connection.
In the inside network, iBGP can run alongside OSPF on the routers, the difference between them being that OSPF carries the internal routes, and BGP the external ones:
* OSPF, the IGP, contains all information about routes _inside_ our network.
* BGP, the EGP, contains all information about _external_ connectivity.
![OSPF, eBGP and iBGP](/bgp-intro/ospf-ebgp-ibgp.png)
### BIRD iBGP configuration
Here's an example for the IPv6 iBGP connection between `R3` and `R1`:
In the IPv6 BIRD configuration of `R3`, add:

    protocol bgp ibgp_r1 {
        import none;
        export where source = RTS_BGP;
        local 2001:db8:40::3 as 64080;
        neighbor 2001:db8:40::1 as 64080;
    }

In the IPv6 BIRD configuration of `R1`, add:

    protocol bgp ibgp_r3 {
        local 2001:db8:40::1 as 64080;
        neighbor 2001:db8:40::3 as 64080;
    }

Using the same AS number for the local and neighbor address simply tells BIRD that we're dealing with an iBGP connection.
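
Expressed as a tiny Python sketch (the function name is made up, this is just the rule from the sentence above written down as code):

```python
def session_type(local_as, neighbor_as):
    """Classify a BGP session by comparing the two AS numbers."""
    return "iBGP" if local_as == neighbor_as else "eBGP"

print(session_type(64080, 64080))  # R3 <-> R1, prints "iBGP"
print(session_type(64080, 65033))  # R3 <-> R10, prints "eBGP"
```
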
Do a `birdc6 configure` in `R1` and `R3`, and look at the result on `R1`:
root@R1:/# birdc6 show route protocol ibgp_r3
BIRD 1.4.5 ready.
2001:db8:10::/48 via fe80::aff:fe28:d801 on vlan216 [ibgp_r3 23:26:12 from 2001:db8:40::3] * (100/20) [AS65033i]
BIRD just learned a route to the remote AS! And, because of this, `H7` in `AS64080` and `R10` in `AS65033` can now find each other:
root@H7:/# traceroute6 -n 2001:db8:10:6::a
traceroute to 2001:db8:10:6::a (2001:db8:10:6::a), 30 hops max, 80 byte packets
1 2001:db8:40:3::1 0.556 ms 0.501 ms 0.501 ms
2 2001:db8:40:d8::1 1.059 ms 1.074 ms 1.078 ms
3 2001:db8:10:6::a 1.281 ms 1.274 ms 1.268 ms
### How OSPF and BGP work together
Since BGP only handles external connectivity, the protocol does not try to be clever about routes inside the local network. When taking a closer look at the BGP route that is received by `R1`, it shows that the BGP information attached to the route only contains information about the first hop _outside_ the network, which is called the BGP next hop:
root@R1:/# birdc6
BIRD 1.4.5 ready.
bird> show route all 2001:db8:10::/48
2001:db8:10::/48 via fe80::aff:fe28:d801 on vlan216 [ibgp_r3 23:26:11 from 2001:db8:40::3] * (100/20) [AS65033i]
Type: BGP unicast univ
BGP.origin: IGP
BGP.as_path: 65033
BGP.next_hop: 2001:db8:40:d910::2
BGP.local_pref: 100
Since `R1` has only got this information, BIRD has to find out what the actual next hop to a router in a directly connected subnet has to be before a route can be exported to the Linux kernel. Luckily this is where the cooperation of the IGP comes into play. Since OSPF knows a route to `2001:db8:40:d910::2`, it can tell us where to forward the traffic in the local network to push it closer to that external BGP next hop. This is exactly the reason why the subnets that connect to routers just outside our own network are also included in OSPF as stub networks!
bird> show route for 2001:db8:40:d910::2
2001:db8:40:d910::/120 via fe80::aff:fe28:d801 on vlan216 [ospf1 2015-06-14] * I (150/20) [10.40.217.3]
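
This lookup can be sketched in a few lines of Python. It's a simplified model of the idea, not how BIRD implements it: the IGP routes are copied from the `birdc6 show route` output of `R1` shown earlier, and the function name `resolve_bgp_next_hop` is hypothetical.

```python
import ipaddress

# OSPF routes known to R1: prefix -> (gateway, interface).
# A gateway of None means the prefix is directly connected.
IGP_ROUTES = {
    ipaddress.ip_network(prefix): target
    for prefix, target in {
        "2001:db8:40:d8::/120": (None, "vlan216"),
        "2001:db8:40:3::/120": (None, "vlan3"),
        "2001:db8:40:2::/120": ("fe80::aff:fe28:d802", "vlan216"),
        "2001:db8:40:d910::/120": ("fe80::aff:fe28:d801", "vlan216"),
    }.items()
}

def resolve_bgp_next_hop(bgp_next_hop):
    """Translate a BGP next hop into a (gateway, interface) pair via the IGP."""
    address = ipaddress.ip_address(bgp_next_hop)
    matching = [net for net in IGP_ROUTES if address in net]
    if not matching:
        return None  # next hop unreachable, the BGP route cannot be used
    best = max(matching, key=lambda net: net.prefixlen)  # longest prefix wins
    gateway, interface = IGP_ROUTES[best]
    if gateway is None:
        # Directly connected: the BGP next hop itself is the gateway.
        return (bgp_next_hop, interface)
    return (gateway, interface)

print(resolve_bgp_next_hop("2001:db8:40:d910::2"))
```

The result, `fe80::aff:fe28:d801` on `vlan216`, is the same gateway that shows up in the `via` part of the BGP route on `R1`.
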
Remember the section about next-hops in the OSPF tutorial? If not, go back and re-read it ("Step three: figuring out shortest paths and determining next-hops"). The same logic applies here. While this router already has a strong opinion about the path that traffic to `2001:db8:10:6::a` has to take to reach the remote network, all this knowledge gets thrown away even before the actual IP packet leaves this router... While BIRD knows the entry point in the remote network, as well as the path through the internal network to reach it, it can only install a route to the locally connected next hop into the actual forwarding routing table of the Linux kernel. The next router which receives the packet has to apply all routing logic again itself to get it forwarded in the right direction. Luckily, protocols like OSPF and BGP are designed in a way that enables us to trust that all routers that cooperate in the routing protocols have the same mindset and will perfectly work together to get the traffic to its destination without endlessly forwarding it in loops between them.
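
To illustrate this hop-by-hop behaviour, here's a toy Python model in which every router only consults its own forwarding table. The tables are heavily simplified and partly assumed: next hops are written as router names instead of link-local addresses, and the `2001:db8:40::/48` entry on `R10` assumes the IPv6 BGP session from the assignments is in place. The function names are made up for this sketch.

```python
import ipaddress

# One small forwarding table per router: prefix -> next router, or "local".
FIBS = {
    "R1": {"2001:db8:10::/48": "R3", "2001:db8:40:3::/120": "local"},
    "R3": {"2001:db8:10::/48": "R10", "2001:db8:40:3::/120": "R1"},
    "R10": {"2001:db8:10:6::a/128": "local", "2001:db8:40::/48": "R3"},
}

def lookup(router, destination):
    """Longest prefix match in the forwarding table of a single router."""
    address = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in FIBS[router].items():
        net = ipaddress.ip_network(prefix)
        if address in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return None if best is None else best[1]

def trace(first_router, destination, max_hops=30):
    """Follow a packet router by router, each making its own decision."""
    path = [first_router]
    router = first_router
    for _ in range(max_hops):
        next_hop = lookup(router, destination)
        if next_hop is None:
            path.append("!N")  # no route: network unreachable
            break
        if next_hop == "local":
            break  # delivered
        path.append(next_hop)
        router = next_hop
    return path

print(trace("R1", "2001:db8:10:6::a"))  # ['R1', 'R3', 'R10']
```

No single router stores the complete path. It only emerges because each router along the way independently picks a next hop that brings the packet closer to its destination.
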
The only thing that the routers in `AS64080` know is that `R10` is the entry point for `AS65033`, and how to get there. They do not have the slightest knowledge about how the internal network of `AS65033` is organized, and there is no way for them to learn about this. When the traffic enters the remote network, that network will take care of delivering it to the actual router or host in that network.
### Can OSPF be used instead of iBGP?
After getting to know iBGP, you might still wonder: "If the routes are in the BIRD master table, and we already have the routers inside the AS talking to each other, why not just export the BGP routes into OSPF?". Well, actually, that can be done, and we can try it for fun. In order to redistribute the BGP routes into OSPF, just shut down the iBGP connections again, add the line `export where source = RTS_BGP;` to the OSPF section of both `R3` and `R10`, and run `birdc configure`.
For example, `R11` now shows:
root@R11:/# birdc6 show r
BIRD 1.4.5 ready.
2001:db8:10:24::/120 via fe80::aff:fe28:2103 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
2001:db8:10:21::/120 dev vlan33 [ospf1 2015-06-14] * I (150/10) [10.40.32.12]
2001:db8:10:30::/117 dev vlan48 [ospf1 2015-06-14] * I (150/10) [10.40.32.11]
2001:db8:10:6::a/128 via fe80::aff:fe28:2101 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.10]
2001:db8:10:6::c/128 via fe80::aff:fe28:2103 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.12]
2001:db8:40::/48 via fe80::aff:fe28:2101 on vlan33 [ospf1 21:00:55] * E2 (150/20/10000) [10.40.32.10]
2001:db8:40:d910::/120 via fe80::aff:fe28:2101 on vlan33 [ospf1 2015-06-14] * I (150/20) [10.40.32.10]
You can see that the route to the neighbor AS is present, but it's tagged as an 'E2' route in OSPF, instead of the usual 'I', meaning it was imported from a different routing protocol on the router that originates this prefix, `10.40.32.10`.
While using OSPF to transport the routes to the other internal routers might work in our little example network in this tutorial, it introduces a number of limitations, one of them being that all extra BGP specific information attached to a route is lost when converting it from a BGP to an OSPF route. This limits the amount of control that can be exercised on the selection of the exit point for traffic from a network to external networks. Another reason to refrain from doing this is that the full BGP table of the Internet contains more than half a million network prefixes. So if you would run a router in a location where you have all those routes in a BGP table, redistributing them to OSPF, pretending that the entire Internet is part of your local network will probably blow up your OSPF process. It's not designed to handle that. ;-)
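
The loss of information can be made visible with a small Python sketch. The two classes and the `redistribute` function are purely illustrative; the BGP attribute values are the ones shown by `show route all` on `R1`, and the metric of 10000 is the value visible in the E2 route above.

```python
from dataclasses import dataclass

@dataclass
class BgpRoute:
    prefix: str
    origin: str
    as_path: tuple
    next_hop: str
    local_pref: int

@dataclass
class OspfExternalRoute:
    prefix: str
    metric_type: str
    metric: int

def redistribute(route, metric=10000):
    """Turn a BGP route into an OSPF external route; only the prefix survives."""
    return OspfExternalRoute(prefix=route.prefix, metric_type="E2", metric=metric)

bgp_route = BgpRoute(
    prefix="2001:db8:10::/48",
    origin="IGP",
    as_path=(65033,),
    next_hop="2001:db8:40:d910::2",
    local_pref=100,
)
external = redistribute(bgp_route)
print(external)
```

The AS path, origin, BGP next hop and local preference have no place to go in the OSPF route, so an internal router can no longer base any decision on them.
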
### The usage of loopback addresses
It might have occurred to you that the iBGP BIRD configuration specifies the local and remote address using loopback addresses instead of interface addresses from an actual connected subnet. Think back to the "The loopback address" section of the OSPF tutorial! The BGP router on the edge of the network, and the internal router which wants to learn about external connectivity using iBGP can be anywhere in the internal network. There may even exist multiple possible paths between them. By using a loopback address as source and target of the iBGP connection, the connection will keep functioning as long as there is any possible path between the two routers. The flow of traffic to the external network will follow the same directions as the iBGP control connection, since both of them use the IGP to reach each other.
![iBGP relying on the IGP](/bgp-intro/ibgp-loopback.png)
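
The "as long as there is any possible path" part can be sketched in Python as a simple reachability check. The topology below is hypothetical (four routers `A` to `D` with two paths between `A` and `C`) and not the network from this tutorial; in reality the IGP does this work for us.

```python
from collections import deque

def reachable(links, start, goal):
    """Breadth-first search over an undirected list of router links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical internal network: A-B-C and A-D-C.
links = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
print(reachable(links, "A", "C"))  # True, both paths are up

links.remove(("A", "B"))
print(reachable(links, "A", "C"))  # True, the loopback of C is still reachable via D

links.remove(("D", "C"))
print(reachable(links, "A", "C"))  # False, no path left
```

An iBGP session between interface addresses on the `A`-`B` link would have gone down at the first failure, while a session between loopback addresses survives until the last path disappears.
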
### Final assignments:
* Well, this one is obvious... Practice some more by finishing setting up all connectivity by configuring the iBGP sessions for IPv4 and IPv6 between `R0` and `R3`, between `R10` and `R11`, and between `R10` and `R12`. Confirm by tracerouting from `H34` and `H19` in `AS65033` to `H6` and `H7` in `AS64080`.
* If there's any part of this first BGP tutorial that you do not understand yet, make sure you will. The following tutorials will be building upon the knowledge gathered here. Don't get depressed if you don't get all of it the first time. Just go back to the top and read the page again, there's an awful lot of information compacted in this page. If you're brave, make up your own example network and try to build it from scratch. It will take some time, but as soon as you are able to traceroute from one far end to another, you've likely run into and solved all aspects you missed before.
* Look around on the internet and read other blogs and tutorials about OSPF and BGP and see if they're much easier to understand having a frame of reference which was set by following this tutorial.
In the next tutorial, [BGP Part II](/bgp-contd/README.md), I'll show more interesting topologies of different networks connecting together using BGP than just two networks with one eBGP connection. By doing so, we'll quickly discover and understand how the actual huge Internet is organized.

Binary files (not shown):

bgp-intro/bgp-hey2.dia
bgp-intro/bgp-hey2.png (43 KiB)
bgp-intro/bgp-heythere.dia
bgp-intro/bgp-heythere.png (15 KiB)
bgp-intro/bgp-ospf-zoom.dia
bgp-intro/bgp-ospf-zoom.png (29 KiB)
bgp-intro/bgp-ospf.dia
bgp-intro/bgp-ospf.png (98 KiB)
(two binary files without a visible name, one of them a 42 KiB image)
bgp-intro/ibgp-loopback.dia
bgp-intro/ibgp-loopback.png (28 KiB)

@@ -0,0 +1,14 @@
auto lo
iface lo inet loopback
auto vlan48
iface vlan48 inet manual
up ip link set up dev vlan48
up ip addr add 10.40.52.19/21 brd + dev vlan48
up ip addr add 2001:db8:10:30::413/117 dev vlan48
up ip route add default via 10.40.48.1 dev vlan48
up ip route add default via 2001:db8:10:30::1 dev vlan48
down ip -6 route del default
down ip addr del 2001:db8:10:30::413/117 dev vlan48
down ip addr del 10.40.52.19/21 dev vlan48
down ip link set down dev vlan48

@@ -0,0 +1,14 @@
auto lo
iface lo inet loopback
auto vlan36
iface vlan36 inet manual
up ip link set up dev vlan36
up ip addr add 10.40.36.34/24 brd + dev vlan36
up ip addr add 2001:db8:10:24::22/120 dev vlan36
up ip route add default via 10.40.36.1 dev vlan36
up ip route add default via 2001:db8:10:24::1 dev vlan36
down ip -6 route del default
down ip addr del 2001:db8:10:24::22/120 dev vlan36
down ip addr del 10.40.36.34/24 dev vlan36
down ip link set down dev vlan36

@@ -0,0 +1,14 @@
auto lo
iface lo inet loopback
auto vlan2
iface vlan2 inet manual
up ip link set up dev vlan2
up ip addr add 10.40.2.6/24 brd + dev vlan2
up ip addr add 2001:db8:40:2::6/120 dev vlan2
up ip route add default via 10.40.2.1 dev vlan2
up ip route add default via 2001:db8:40:2::1 dev vlan2
down ip -6 route del default
down ip addr del 2001:db8:40:2::6/120 dev vlan2
down ip addr del 10.40.2.6/24 dev vlan2
down ip link set down dev vlan2

@@ -0,0 +1,14 @@
auto lo
iface lo inet loopback
auto vlan3
iface vlan3 inet manual
up ip link set up dev vlan3
up ip addr add 10.40.3.7/24 brd + dev vlan3
up ip addr add 2001:db8:40:3::7/120 dev vlan3
up ip route add default via 10.40.3.1 dev vlan3
up ip route add default via 2001:db8:40:3::1 dev vlan3
down ip -6 route del default
down ip addr del 2001:db8:40:3::7/120 dev vlan3
down ip addr del 10.40.3.7/24 dev vlan3
down ip link set down dev vlan3

@@ -0,0 +1,26 @@
router id 10.40.217.0;
log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
interface "lo" {
stub;
};
interface "vlan216" {
};
interface "vlan2" {
stub;
};
};
}

@@ -0,0 +1,25 @@
router id 10.40.217.0;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:40::/128;
interface "vlan216" {
};
interface "vlan2" {
stub;
};
};
}

@@ -0,0 +1,24 @@
auto lo
iface lo inet loopback
up ip addr add 10.40.217.0/32 dev lo
up ip addr add 2001:db8:40:: dev lo
down ip addr del 2001:db8:40:: dev lo
down ip addr del 10.40.217.0/32 dev lo
auto vlan2
iface vlan2 inet manual
up ip link set up dev vlan2
up ip addr add 10.40.2.1/24 brd + dev vlan2
up ip addr add 2001:db8:40:2::1/120 dev vlan2
down ip addr del 2001:db8:40:2::1/120 dev vlan2
down ip addr del 10.40.2.1/24 dev vlan2
down ip link set down dev vlan2
auto vlan216
iface vlan216 inet manual
up ip link set up dev vlan216
up ip addr add 10.40.216.2/28 brd + dev vlan216
up ip addr add 2001:db8:40:d8::2/120 dev vlan216
down ip addr del 2001:db8:40:d8::2/120 dev vlan216
down ip addr del 10.40.216.2/28 dev vlan216
down ip link set down dev vlan216

@@ -0,0 +1,26 @@
router id 10.40.217.1;
log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
interface "lo" {
stub;
};
interface "vlan216" {
};
interface "vlan3" {
stub;
};
};
}

@@ -0,0 +1,25 @@
router id 10.40.217.1;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:40::1/128;
interface "vlan216" {
};
interface "vlan3" {
stub;
};
};
}

@@ -0,0 +1,24 @@
auto lo
iface lo inet loopback
up ip addr add 10.40.217.1/32 dev lo
up ip addr add 2001:db8:40::1 dev lo
down ip addr del 2001:db8:40::1 dev lo
down ip addr del 10.40.217.1/32 dev lo
auto vlan3
iface vlan3 inet manual
up ip link set up dev vlan3
up ip addr add 10.40.3.1/24 brd + dev vlan3
up ip addr add 2001:db8:40:3::1/120 dev vlan3
down ip addr del 2001:db8:40:3::1/120 dev vlan3
down ip addr del 10.40.3.1/24 dev vlan3
down ip link set down dev vlan3
auto vlan216
iface vlan216 inet manual
up ip link set up dev vlan216
up ip addr add 10.40.216.3/28 brd + dev vlan216
up ip addr add 2001:db8:40:d8::3/120 dev vlan216
down ip addr del 2001:db8:40:d8::3/120 dev vlan216
down ip addr del 10.40.216.3/28 dev vlan216
down ip link set down dev vlan216

@@ -0,0 +1,26 @@
router id 10.40.32.10;
log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
interface "lo" {
stub;
};
interface "vlan33" {
};
interface "vlan217" {
stub;
};
};
}

@@ -0,0 +1,25 @@
router id 10.40.32.10;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:10:6::a/128;
interface "vlan33" {
};
interface "vlan217" {
stub;
};
};
}

@@ -0,0 +1,24 @@
auto lo
iface lo inet loopback
up ip addr add 10.40.32.10/32 dev lo
up ip addr add 2001:db8:10:6::a dev lo
down ip addr del 2001:db8:10:6::a dev lo
down ip addr del 10.40.32.10/32 dev lo
auto vlan33
iface vlan33 inet manual
up ip link set up dev vlan33
up ip addr add 10.40.33.1/26 brd + dev vlan33
up ip addr add 2001:db8:10:21::1/120 dev vlan33
down ip addr del 2001:db8:10:21::1/120 dev vlan33
down ip addr del 10.40.33.1/26 dev vlan33
down ip link set down dev vlan33
auto vlan217
iface vlan217 inet manual
up ip link set up dev vlan217
up ip addr add 10.40.217.18/30 brd + dev vlan217
up ip addr add 2001:db8:40:d910::2/120 dev vlan217
down ip addr del 2001:db8:40:d910::2/120 dev vlan217
down ip addr del 10.40.217.18/30 dev vlan217
down ip link set down dev vlan217

@@ -0,0 +1,26 @@
router id 10.40.32.11;
log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
interface "lo" {
stub;
};
interface "vlan33" {
};
interface "vlan48" {
stub;
};
};
}

@@ -0,0 +1,25 @@
router id 10.40.32.11;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:10:6::b/128;
interface "vlan33" {
};
interface "vlan48" {
stub;
};
};
}

@@ -0,0 +1,24 @@
auto lo
iface lo inet loopback
up ip addr add 10.40.32.11/32 dev lo
up ip addr add 2001:db8:10:6::b dev lo
down ip addr del 2001:db8:10:6::b dev lo
down ip addr del 10.40.32.11/32 dev lo
auto vlan48
iface vlan48 inet manual
up ip link set up dev vlan48
up ip addr add 10.40.48.1/21 brd + dev vlan48
up ip addr add 2001:db8:10:30::1/117 dev vlan48
down ip addr del 2001:db8:10:30::1/117 dev vlan48
down ip addr del 10.40.48.1/21 dev vlan48
down ip link set down dev vlan48
auto vlan33
iface vlan33 inet manual
up ip link set up dev vlan33
up ip addr add 10.40.33.2/26 brd + dev vlan33
up ip addr add 2001:db8:10:21::2/120 dev vlan33
down ip addr del 2001:db8:10:21::2/120 dev vlan33
down ip addr del 10.40.33.2/26 dev vlan33
down ip link set down dev vlan33

@@ -0,0 +1,26 @@
router id 10.40.32.12;
log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
interface "lo" {
stub;
};
interface "vlan33" {
};
interface "vlan36" {
stub;
};
};
}

@@ -0,0 +1,25 @@
router id 10.40.32.12;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:10:6::c/128;
interface "vlan33" {
};
interface "vlan36" {
stub;
};
};
}

@@ -0,0 +1,24 @@
auto lo
iface lo inet loopback
up ip addr add 10.40.32.12/32 dev lo
up ip addr add 2001:db8:10:6::c dev lo
down ip addr del 2001:db8:10:6::c dev lo
down ip addr del 10.40.32.12/32 dev lo
auto vlan36
iface vlan36 inet manual
up ip link set up dev vlan36
up ip addr add 10.40.36.1/24 brd + dev vlan36
up ip addr add 2001:db8:10:24::1/120 dev vlan36
down ip addr del 2001:db8:10:24::1/120 dev vlan36
down ip addr del 10.40.36.1/24 dev vlan36
down ip link set down dev vlan36
auto vlan33
iface vlan33 inet manual
up ip link set up dev vlan33
up ip addr add 10.40.33.3/26 brd + dev vlan33
up ip addr add 2001:db8:10:21::3/120 dev vlan33
down ip addr del 2001:db8:10:21::3/120 dev vlan33
down ip addr del 10.40.33.3/26 dev vlan33
down ip link set down dev vlan33

@@ -0,0 +1,26 @@
router id 10.40.217.3;
log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
interface "lo" {
stub;
};
interface "vlan216" {
};
interface "vlan217" {
stub;
};
};
}

@@ -0,0 +1,25 @@
router id 10.40.217.3;
log "/var/log/bird/bird6.log" all;
debug protocols { states, routes, filters, interfaces }
protocol kernel {
import none;
export all;
}
protocol device {
# defaults...
}
protocol ospf {
area 0 {
# BIRD ignores the IPv6 lo because it has no link local address
stubnet 2001:db8:40::3/128;
interface "vlan216" {
};
interface "vlan217" {
stub;
};
};
}

@@ -0,0 +1,24 @@
auto lo
iface lo inet loopback
up ip addr add 10.40.217.3/32 dev lo
up ip addr add 2001:db8:40::3 dev lo
down ip addr del 2001:db8:40::3 dev lo
down ip addr del 10.40.217.3/32 dev lo
auto vlan216
iface vlan216 inet manual
up ip link set up dev vlan216
up ip addr add 10.40.216.1/28 brd + dev vlan216
up ip addr add 2001:db8:40:d8::1/120 dev vlan216
down ip addr del 2001:db8:40:d8::1/120 dev vlan216
down ip addr del 10.40.216.1/28 dev vlan216
down ip link set down dev vlan216
auto vlan217
iface vlan217 inet manual
up ip link set up dev vlan217
up ip addr add 10.40.217.17/30 brd + dev vlan217
up ip addr add 2001:db8:40:d910::1/120 dev vlan217
down ip addr del 2001:db8:40:d910::1/120 dev vlan217
down ip addr del 10.40.217.17/30 dev vlan217
down ip link set down dev vlan217

161
bgp-intro/lxc/fixnetwork.sh Normal file
@@ -0,0 +1,161 @@
#!/bin/sh
sed -i '/lxc.network/d' R*/config H*/config
cat <<EOF >> H6/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan2
lxc.network.veth.pair = h6.2
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:02:06
EOF
cat <<EOF >> R0/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan216
lxc.network.veth.pair = r0.216
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:d8:02
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan2
lxc.network.veth.pair = r0.2
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:02:01
EOF
cat <<EOF >> H7/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan3
lxc.network.veth.pair = h7.3
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:03:07
EOF
cat <<EOF >> R1/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan216
lxc.network.veth.pair = r1.216
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:d8:03
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan3
lxc.network.veth.pair = r1.3
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:03:01
EOF
cat <<EOF >> R3/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan216
lxc.network.veth.pair = r3.216
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:d8:01
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan217
lxc.network.veth.pair = r3.217
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:d9:10
EOF
cat <<EOF >> R10/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan33
lxc.network.veth.pair = r10.33
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:21:01
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan217
lxc.network.veth.pair = r10.217
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:d9:11
EOF
cat <<EOF >> R11/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan33
lxc.network.veth.pair = r11.33
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:21:02
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan48
lxc.network.veth.pair = r11.48
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:30:01
EOF
cat <<EOF >> H19/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan48
lxc.network.veth.pair = h19.48
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:34:13
EOF
cat <<EOF >> R12/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan33
lxc.network.veth.pair = r12.33
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:21:03
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan36
lxc.network.veth.pair = r12.36
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:24:01
EOF
cat <<EOF >> H34/config
lxc.network.type = veth
lxc.network.flags = up
lxc.network.name = vlan36
lxc.network.veth.pair = h34.36
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:28:24:22
EOF

Binary file not shown.

Binary file not shown.

120
birdhouse-intro/README.md Normal file
@@ -0,0 +1,120 @@
# A basic network example
In the very first part of this tutorial, we're going to play around with Linux containers and networking a bit, and introduce the network of a fictional company that I'll be using as an example in the tutorials.
The goal is to become comfortable with quickly setting up and tearing down example networks.
## The Birdhouse Factory
Well, there it is... The Birdhouse Factory network:
![Birdhouse Network](/birdhouse-intro/birdhouse-intro.png)
The Birdhouse Factory is a fictional company that manufactures little wooden birdhouses. Besides their manufacturing process and the warehouse, they have an office where accounting and sales people work.
Since the Birdhouse Factory people also like internet technology, they combined these interests and run their own webshop where you can buy birdhouses online, and their own mail server. The Factory has some IPv4 space allocated from an ISP, where they run their servers, and where they have a NAT router in front of their office network, which uses RFC1918 IPv4 network ranges.
## Cloning and configuring new containers
After following the [tutorial to set up a lab environment](/lxcbird/README.md) we end up with the first container, "birdbase". Make sure this birdbase container is stopped (by using `lxc-stop`, or typing `halt` on the container prompt after using `lxc-attach`), so it can be cloned into new ones.
lxcbird:/var/lib/lxc 0-# lxc-ls --fancy
NAME STATE IPV4 IPV6 AUTOSTART
----------------------------------------
.git STOPPED - - NO
birdbase STOPPED - - NO
Heh, `lxc-ls` is not that clever, and also thinks my git repository is a container. Oh well...
Let's create some of the systems shown in the network picture:
lxcbird:/var/lib/lxc 0-# lxc-clone -s birdbase sparrow
Created container sparrow as snapshot of birdbase
lxcbird:/var/lib/lxc 0-# lxc-clone -s birdbase weaver
Created container weaver as snapshot of birdbase
Now we need to configure the network interfaces and add a little iptables ruleset for the NAT gateway.
### Sparrow
Sparrow has two interfaces: one in vlan10, the network for public services, and one in vlan60, the office network. The network interfaces are defined in `sparrow/config`:
lxc.network.type = veth
lxc.network.name = vlan10
lxc.network.veth.pair = sparrow.10
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:c6:33:64:13
lxc.network.type = veth
lxc.network.name = vlan60
lxc.network.veth.pair = sparrow.60
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:01:3c:01
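A side note on the `hwaddr` values: they look like they're derived from the IPv4 address of the interface, written in hex behind a fixed `02:00` prefix (for example, `c6:33:64:13` is 198.51.100.19). The following is only a sketch of that convention; the function name `ipv4_to_hwaddr` is made up for this example and is not part of lxc:

```sh
#!/bin/sh
# Hypothetical helper: derive a MAC address from an IPv4 address, following
# the pattern the examples seem to use (02:00 + the four octets in hex).
ipv4_to_hwaddr() {
    old_ifs=$IFS
    IFS=.
    # Unquoted on purpose: split the dotted quad into $1..$4.
    set -- $1
    IFS=$old_ifs
    printf '02:00:%02x:%02x:%02x:%02x\n' "$1" "$2" "$3" "$4"
}

ipv4_to_hwaddr 198.51.100.19   # vlan10 on sparrow
ipv4_to_hwaddr 10.1.60.1       # vlan60 on sparrow
```

The `02` in the first byte marks the address as locally administered, so it should not clash with vendor-assigned addresses.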
And they're configured with addresses in `sparrow/rootfs/etc/network/interfaces`:
auto lo
iface lo inet loopback
auto vlan10
iface vlan10 inet manual
pre-up iptables-restore < /etc/network/firewall
up ip link set up dev vlan10
up ip addr add 198.51.100.19/26 brd + dev vlan10
up ip route add default via 198.51.100.1 dev vlan10
down ip addr del 198.51.100.19/26 dev vlan10
down ip link set down dev vlan10
auto vlan60
iface vlan60 inet manual
up ip link set up dev vlan60
up ip addr add 10.1.60.1/24 brd + dev vlan60
down ip addr del 10.1.60.1/24 dev vlan60
down ip link set down dev vlan60
In order to activate NAT, here's the bare minimum to put in `sparrow/rootfs/etc/network/firewall`:
*nat
-A POSTROUTING -o vlan10 -j MASQUERADE
COMMIT
Now, start the container with `lxc-start -d -n sparrow` and get a command prompt with `lxc-attach -n sparrow`. Use `ip a`, `ip r` etc, to verify that addresses and routes are set correctly.
### Weaver
Weaver is a bit simpler, since it's just an end host with one network interface. For `weaver/config`:
lxc.network.type = veth
lxc.network.name = vlan60
lxc.network.veth.pair = weaver.60
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = 02:00:0a:01:3c:15
And `weaver/rootfs/etc/network/interfaces`:
auto lo
iface lo inet loopback
auto vlan60
iface vlan60 inet manual
up ip link set up dev vlan60
up ip addr add 10.1.60.21/24 brd + dev vlan60
up ip route add default via 10.1.60.1 dev vlan60
down ip addr del 10.1.60.21/24 dev vlan60
down ip link set down dev vlan60
Start weaver, get a command prompt, and see if you have proper connectivity to the outside internet. Traceroute something outside for example. If not, debug the IP addresses and routes and fix it.
## Finishing up... some assignments.
The "ISP Router" functionality can be handled by the LXC host machine, as shown in the [introduction](/lxcbird/README.md).
To finish this tutorial:
* Verify how openvswitch is used by looking at the output of `ovs-vsctl show` in the lxc host machine.
* Create a third container, the webshop server, and configure it. Confirm you can reach it from weaver, by running a SimpleHTTPServer with python (`python -m SimpleHTTPServer`) and pointing wget to it from weaver. You should see the outside IPv4 address of sparrow as source address of the request because of the NAT. Also, because of the NAT, the webshop server does not need to know a route to the `10.1.60.0/24` network, because it's hidden behind sparrow.
That's basically it. As you can see, once you get the hang of this, doing the configuration by hand every time quickly gets extremely boring. For later tutorials, I'll make sure all files that make up the starting point of the configuration are available to simply copy into the newly cloned containers.
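Until then, a few lines of shell can take away some of the typing. This is a minimal sketch, not something from the lab setup itself: `lxc_net_stanza` is a hypothetical helper that prints one network stanza in the same form as the `weaver/config` example above.

```sh
#!/bin/sh
# Hypothetical helper: print one lxc.network stanza in the style used above.
# $1 container name, $2 vlan number, $3 MAC address
lxc_net_stanza() {
    cat <<EOF
lxc.network.type = veth
lxc.network.name = vlan$2
lxc.network.veth.pair = $1.$2
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
lxc.network.hwaddr = $3
EOF
}

# Prints the stanza for weaver's single interface on standard output.
lxc_net_stanza weaver 60 02:00:0a:01:3c:15
```

Redirecting the output with `>> weaver/config` would append it to the container configuration, so make sure the old `lxc.network` lines are gone first.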
Next up: [Meanwhile at the Birdhouse Factory...](/birdhouse-vlans-vpn/README.md)

Binary file not shown.

Binary file not shown.

@@ -0,0 +1,71 @@
# The Birdhouse Factory continued
## A larger network
While we were [playing around with linux containers](/birdhouse-intro/README.md), Carl, the Birdhouse Factory sysadmin, has been busy expanding the computer network at the Birdhouse Factory!
![Birdhouse network with vlans and vpn](/birdhouse-vlans-vpn/birdhouse-vlans-vpn.png)
Instead of only having a simple NAT router with a single vlan for a server and a few workstations, the following new elements have been introduced:
* Three different vlans have been created, to split networks for workstations of employees in the main office, internal servers and the warehouse, which has network-connected manufacturing equipment.
* VPN server software has been installed to provide employees with the ability to work outside the office. Carl has configured the VPN to push a route to clients that contains all three office vlans, `10.1.32.0/19`.
* A branch office has been opened in a new location, and an IP in IP tunnel has been established to provide access to the servers in the main office.
* Using the firewall on the main office gateway, access is granted from the office network and remote branch office network to the internal server network, but not between workstations in different locations, and not from workstations to the warehouse network.
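To see why a single `10.1.32.0/19` route is enough for the VPN clients, the prefix arithmetic can be tried in plain shell. The sketch below assumes the three office vlans are `10.1.59.0/24`, `10.1.60.0/24` and `10.1.63.0/24` (the networks that show up behind sparrow in the route lists further down), and uses `10.1.18.1` as an example address that is not part of the main office. `ipv4_to_int` and `in_prefix` are made-up helper names:

```sh
#!/bin/sh
# Hypothetical helpers for checking prefix membership with shell arithmetic.
ipv4_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # unquoted on purpose: split into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# in_prefix ADDRESS PREFIX/LEN: succeeds if ADDRESS lies inside the prefix.
in_prefix() {
    addr=$(ipv4_to_int "$1")
    net=$(ipv4_to_int "${2%/*}")
    len=${2#*/}
    # Network mask as a 32 bit number, e.g. /19 gives 255.255.224.0.
    mask=$(( (4294967295 << (32 - $len)) & 4294967295 ))
    [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

for ip in 10.1.59.1 10.1.60.21 10.1.63.1 10.1.18.1; do
    if in_prefix "$ip" 10.1.32.0/19; then
        echo "$ip is covered by the pushed route"
    else
        echo "$ip is outside 10.1.32.0/19"
    fi
done
```

A /19 leaves 13 host bits, so it spans 10.1.32.0 up to and including 10.1.63.255; the first three addresses fall inside that range, the last one does not.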
## Separating concerns...
One thing that's bothering Carl is the amount of functionality that's currently running on the single router "sparrow". Besides being the office gateway, it's also a VPN server and gateway to an external office now. But, not visible in the picture, it's also a webmail proxy, backup server, it runs nagios and munin and an outgoing mail relay.
In order to make maintenance and upgrades easier, Carl would like to refactor the network so that functionality is split over multiple servers.
## Separate routers
Carl fires up his network diagram editor, loads the drawing of his network layout, and starts changing it:
![Birdhouse network with split routers](/birdhouse-vlans-vpn/birdhouse-vlans-vpn-split.png)
Two new routers have been introduced here, a separate VPN gateway, "pigeon", and a separate IP tunnel gateway "owl".
Both of the new routers have an interface in the public network:
* The VPN server "pigeon" needs to have a publicly accessible IP address for VPN clients to connect to.
* The tunnel gateway connecting the branch office to the internal network needs a public address to be able to connect to the remote tunnel endpoint.
Although this is a nice first step, Carl realizes it's not ready yet. Something is missing.
The internal network has been split up, and the various parts of it cannot communicate with each other any more. Using the public network segment to point RFC1918 routes to the other routers is not really an option, since the SNAT rules for outgoing traffic rewrite the RFC1918 addresses, and working around that requires complex firewall/NAT exceptions. So, as a best practice, Carl does not like to mix RFC1918 with publicly routable addresses on the same vlan, knowing it will cause too many headaches.
## An internal routing vlan
Carl decides to introduce an extra vlan, which is going to be used for exchanging traffic between the routers:
![Birdhouse network with split routers and internal routing vlan](/birdhouse-vlans-vpn/birdhouse-vlans-vpn-split-routing-vlan.png)
Using this extra vlan, each router can be configured with routes to the rest of the network. This is already much better.
However, the next question immediately arises... In order to make sure each part of the network can be reached from each other part, Carl would have to program all the following routes into the separate routers:
Office Gateway "sparrow":
* `10.1.18.0/24 via 10.1.32.3`
* `10.1.33.66/31 via 10.1.32.3`
* `10.1.62.0/24 via 10.1.32.2`
VPN Gateway "pigeon":
* `10.1.18.0/24 via 10.1.32.3`
* `10.1.33.66/31 via 10.1.32.3`
* `10.1.59.0/24 via 10.1.32.1`
* `10.1.60.0/24 via 10.1.32.1`
* `10.1.63.0/24 via 10.1.32.1`
Tunnel gateway "owl":
* `10.1.59.0/24 via 10.1.32.1`
* `10.1.60.0/24 via 10.1.32.1`
* `10.1.62.0/24 via 10.1.32.2`
* `10.1.63.0/24 via 10.1.32.1`
Carl realizes this is going to turn into a real nightmare when the network keeps expanding in the future, and starts looking for a better way to handle all the routes.
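To get a feeling for how this grows, the lists above can be generated from a single table of "which prefix lives behind which router". This is only an illustration, under the assumption that the table below matches the picture; `static_routes` is a hypothetical helper and it only prints the `ip route add` commands, it doesn't run them:

```sh
#!/bin/sh
# Which prefix sits behind which router address on the routing vlan,
# copied from the lists above: "name address prefix".
networks='sparrow 10.1.32.1 10.1.59.0/24
sparrow 10.1.32.1 10.1.60.0/24
sparrow 10.1.32.1 10.1.63.0/24
pigeon 10.1.32.2 10.1.62.0/24
owl 10.1.32.3 10.1.18.0/24
owl 10.1.32.3 10.1.33.66/31'

# Print the static routes a router needs: one for every prefix that sits
# behind some other router. Nothing is changed on the system.
static_routes() {
    printf '%s\n' "$networks" | while read -r owner via prefix; do
        [ "$owner" = "$1" ] || echo "ip route add $prefix via $via"
    done
}

for router in sparrow pigeon owl; do
    echo "# $router"
    static_routes "$router"
done
```

Three routers already need 12 routes between them, and every new network adds a route on each of the other routers.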
It would be cool if the routers could just talk to each other on the internal routing vlan and tell each other which networks are reachable via them...
Next: [An introduction to OSPF](/ospf-intro/README.md)

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

327
lxcbird/README.md Normal file
@@ -0,0 +1,327 @@
# Setting up a lab environment
For the tutorials, I've chosen to use Debian GNU/Linux with lxc, btrfs and openvswitch and some extra git thrown at it to build simulations of complete networks. The current stable Debian Release, 8 (Jessie) already contains everything you need for this.
So, make sure you get your hands on an empty physical or virtual machine. The one I use is a standard Debian x86-64 (a.k.a. amd64) virtual machine.
* LXC provides a very lightweight way to run containers with their own network namespace, separated from each other.
* I use btrfs subvolumes as container filesystems.
* The advantage of using openvswitch is that we can very easily run a vlan capable switch, by just configuring ports on it as either access or trunk port with any of the vlan numbers assigned, much like you would do with a physical switch.
* Creating a git repository just outside the container root filesystems, with a .gitignore that only includes specific files, makes it easy to store the configuration of a whole test network. For example, after destroying a complete container and cloning it from another container again, a simple git checkout is enough to put the configuration inside the container back in place.
Here's a simple schematic overview of what I mean:
![lxc-openvswitch-topology](/lxcbird/lxc-openvswitch-topology.png)
## Some basic packages
To be able to create containers and hook up their network interfaces to openvswitch, we need the following packages:
apt-get install lxc debootstrap openvswitch-switch
## Setting up networking
The lxc host system only needs a single external network interface, for you to manage it, and probably to masquerade outgoing traffic from the test environment, using NAT. It's of course also possible to route some real address space to this box for use in the test networks, but I'm not doing so.
Here's the `/etc/network/interfaces` of my lxc host, well, almost, since I replaced the eth0 addresses with fakes:
lxcbird:/etc/network 0-# cat interfaces
auto lo
iface lo inet loopback
auto eth0
iface eth0 inet manual
up ip link set up dev eth0
up ip addr add 10.255.1.34/24 brd + dev eth0
up ip addr add 2001:db8:ffff::22/64 dev eth0
up ip route add default via 10.255.1.1 dev eth0
up ip route add default via 2001:db8:ffff::1 dev eth0
down ip -6 route del default
down ip addr del 2001:db8:ffff::22/64 dev eth0
down ip addr del 10.255.1.34/24 dev eth0
down ip link set down dev eth0
allow-ovs ovs0
iface ovs0 inet manual
pre-up ovs-vsctl add-br ovs0
up ip link set up dev ovs0
down ip link set down dev ovs0
post-down ovs-vsctl del-br ovs0
allow-ovs vlan10
iface vlan10 inet manual
pre-up ovs-vsctl add-port ovs0 vlan10 tag=10 -- set interface vlan10 type=internal
up ip link set up dev vlan10
up ip addr add 198.51.100.1/24 brd + dev vlan10
up ip addr add 2001:db8:1998::1/120 dev vlan10
down ip addr del 2001:db8:1998::1/120 dev vlan10
down ip addr del 198.51.100.1/24 dev vlan10
down ip link set down dev vlan10
post-down ovs-vsctl del-port ovs0 vlan10
As you can see, I'm not a fan of the default way `network/interfaces` works in Debian. Actually, I always use manual mode and (pre-)up and (post-)down rules to set up and tear down everything. This is more convenient if you don't like magic too much, and often work with multiple addresses and extra commands on interfaces.
## Masquerading outgoing traffic
To enable masquerading outgoing traffic from the test networks, make sure you enable IP forwarding, for example by putting the following settings in the `/etc/sysctl.conf`...
net.ipv4.ip_forward = 1
net.ipv6.conf.all.forwarding = 1
net.ipv6.conf.default.forwarding = 1
...and by using a few simple netfilter rules to do the NAT, like...
*nat
-A POSTROUTING -o eth0 -j MASQUERADE
COMMIT
*filter
:FORWARD DROP [0:0]
-A FORWARD -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A FORWARD -i vlan10 -o eth0 -j ACCEPT
COMMIT
...which can be done for IPv4, as well as for IPv6, because NAT for IPv6 has finally been implemented. For test environments like this, it's very helpful, since we can just use documentation addresses from `2001:db8::/32` and are still able to access the outside internet if needed.
## Setting up version control
lxcbird:/var/lib/lxc 0-# git init
Initialized empty Git repository in /var/lib/lxc-bird/.git/
Now make sure your `.gitignore` looks like this, to include only very specific files from all containers:
lxcbird:/var/lib/lxc 0-# cat .gitignore
*.log
*/*
!*/config
!*/rootfs
*/rootfs/*
!*/rootfs/etc/
*/rootfs/etc/*
!*/rootfs/etc/hosts
!*/rootfs/etc/sysctl.conf
!*/rootfs/etc/network/
*/rootfs/etc/network/*
!*/rootfs/etc/network/interfaces
!*/rootfs/etc/network/firewall
!*/rootfs/etc/network/firewall6
!*/rootfs/etc/bird/
*/rootfs/etc/bird/*
!*/rootfs/etc/bird/bird.conf
!*/rootfs/etc/bird/bird6.conf
lxcbird:/var/lib/lxc 0-# git add .gitignore
lxcbird:/var/lib/lxc 0-# git commit -m "Only include specific files from containers"
[master (root-commit) 8ecfeec] Only include specific files from containers
1 file changed, 20 insertions(+)
create mode 100644 .gitignore
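If you want to check what these patterns actually let through before trusting them with a real container, something like the following can help. It's a sketch under a few assumptions: `git` and `mktemp` are available, the file names it creates are just examples, and `gitignore_demo` is a made-up function name. It copies the `.gitignore` into a throwaway repository, creates a handful of files, and lists the ones git would pick up:

```sh
#!/bin/sh
# Hypothetical helper: show which example files survive a given .gitignore.
# $1 path to the .gitignore to test
gitignore_demo() {
    ignorefile=$1
    dir=$(mktemp -d) || return 1
    cp "$ignorefile" "$dir/.gitignore" || return 1
    (
        cd "$dir" || exit 1
        git init -q . 2>/dev/null || exit 1
        mkdir -p birdbase/rootfs/etc/network
        touch birdbase/config \
              birdbase/birdbase.log \
              birdbase/rootfs/etc/passwd \
              birdbase/rootfs/etc/network/interfaces
        # Untracked files that are not ignored, so what "git add ." would take.
        git ls-files --others --exclude-standard
    )
    rm -rf "$dir"
}

# Only run the demo when the repository from this section exists.
if [ -f /var/lib/lxc/.gitignore ]; then
    gitignore_demo /var/lib/lxc/.gitignore
fi
```

With the patterns above, the expectation is that `birdbase/config` and `birdbase/rootfs/etc/network/interfaces` are listed (next to `.gitignore` itself), while the log file and `passwd` are left out.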
## Creating the first container
Here's an example to create a first container, which we'll configure a bit and use as a template to clone all other containers from.
lxcbird:/var/lib/lxc 0-# MIRROR=http://ftp.nl.debian.org/debian lxc-create -t debian -B btrfs -n birdbase -- -r jessie
### Configure the network and openvswitch up/down script
In `birdbase/config`, lxc-create has put some basic configuration. The networking configuration has to be set up now, so we can test our connectivity and install some extra software. To be able to do so, I'm going to configure it with an IPv4 and IPv6 address in the range of vlan10, and point my default gateway to the lxc host system.
In the config file, instead of...
lxc.network.type = empty
...it should look more like this...
lxc.network.type = veth
lxc.network.name = vlan10
lxc.network.veth.pair = birdbase.10
lxc.network.flags = up
lxc.network.script.up = /etc/lxc/lxc-openvswitch
lxc.network.script.down = /etc/lxc/lxc-openvswitch
...oh, and by the way, the lxc network script referenced is a really simple script to integrate lxc with openvswitch, which simply attaches an interface in the container to a vlan inside openvswitch based on the number after the dot. It has to be present on the host system, not in the container:
lxcbird:/etc/lxc 0-# cat lxc-openvswitch
#!/bin/sh
# $1 container name
# $2 config section name (net)
# $3 execution context (up/down)
# $4 network type (empty/veth/macvlan/phys)
# $5 (host-sided) device name
if [ "$3" = "up" ]
then
vlan=$(echo "$5" | awk -F . '{ print $NF }')
ovs-vsctl add-port ovs0 $5 tag=$vlan
else
ovs-vsctl del-port ovs0 $5
fi
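The "number after the dot" logic is easy to try out without touching openvswitch. As a small sketch, with `vlan_of` being a made-up name, the same result as the `awk` call can be had with shell parameter expansion:

```sh
#!/bin/sh
# Hypothetical helper mirroring the awk call in lxc-openvswitch: take the
# part after the last dot of the host-side device name as vlan number.
vlan_of() {
    echo "${1##*.}"
}

for dev in birdbase.10 sparrow.60 r10.217; do
    echo "$dev -> would be added to ovs0 with tag=$(vlan_of "$dev")"
done
```

Keep in mind that Linux interface names are limited to 15 characters, so the container name plus the dot plus the vlan number has to fit in that.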
Instead of setting the container IP address and gateway in the lxc configuration file, I prefer using network/interfaces inside the container, because we'll be using that for more complex networking anyway in the tutorials:
lxcbird:/var/lib/lxc/birdbase 0-# cat rootfs/etc/network/interfaces
auto lo
iface lo inet loopback
auto vlan10
iface vlan10 inet manual
up ip link set up dev vlan10
up ip addr add 198.51.100.254/24 brd + dev vlan10
up ip addr add 2001:db8:1998::fe/120 dev vlan10
up ip route add default via 198.51.100.1 dev vlan10
up ip route add default via 2001:db8:1998::1 dev vlan10
down ip -6 route del default
down ip addr del 2001:db8:1998::fe/120 dev vlan10
down ip route del default
down ip addr del 198.51.100.254/24 dev vlan10
down ip link set down dev vlan10
### Prevent Debian from installing unnecessary packages
Now, before starting it, we need to finish up a few basic configuration settings...
lxcbird:/var/lib/lxc/birdbase 0-# echo 'APT::Install-Recommends "false";' > rootfs/etc/apt/apt.conf.d/00InstallRecommends
I hate the default of installing recommends in Debian, so I always turn that off. The general advice for Debian is to install Recommends, so that you also get the other packages that are 'generally found together with these ones'. I don't really see the pattern in this, and I recommend just trying to disable Recommends to see which issues you run into, so you learn more about how related software works together. Anyway, for our minimal BIRD lxc container, we won't run into any problem doing so now.
### Start!
Now, let's try to start it and see what happens!
lxcbird:/var/lib/lxc/birdbase 0-# lxc-start -d -n birdbase
lxcbird:/var/lib/lxc/birdbase 0-# lxc-attach -n birdbase
root@birdbase:/#
root@birdbase:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
215: vlan10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 02:00:c6:33:64:fe brd ff:ff:ff:ff:ff:ff
inet 198.51.100.254/24 brd 198.51.100.255 scope global vlan10
valid_lft forever preferred_lft forever
inet6 2001:db8:1998::fe/120 scope global
valid_lft forever preferred_lft forever
inet6 fe80::c6ff:fe33:64fe/64 scope link
valid_lft forever preferred_lft forever
Let's verify if we have proper outgoing network connectivity!
root@birdbase:/# ping knorrie.org
bash: ping: command not found
Oh, there's our first problem... it's still a bit too basic :)
## Finishing our birdbase container
root@birdbase:/# apt-get update
[...]
root@birdbase:/# apt-get install iputils-ping bird dnsutils iptables iptstate mtr-tiny tcpdump nmon traceroute iftop iperf3
[...]
The fact that we can do this already proves networking is set up right!
root@birdbase:/# ping -n -c 3 knorrie.org
PING knorrie.org (82.94.188.77) 56(84) bytes of data.
64 bytes from 82.94.188.77: icmp_seq=1 ttl=54 time=6.64 ms
64 bytes from 82.94.188.77: icmp_seq=2 ttl=54 time=5.12 ms
64 bytes from 82.94.188.77: icmp_seq=3 ttl=54 time=3.91 ms
--- knorrie.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 3.913/5.228/6.646/1.118 ms
root@birdbase:/# ping6 -n -c 3 knorrie.org
PING knorrie.org(2001:888:2177::4d) 56 data bytes
64 bytes from 2001:888:2177::4d: icmp_seq=1 ttl=53 time=5.51 ms
64 bytes from 2001:888:2177::4d: icmp_seq=2 ttl=53 time=3.99 ms
64 bytes from 2001:888:2177::4d: icmp_seq=3 ttl=53 time=3.39 ms
--- knorrie.org ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2002ms
rtt min/avg/max/mdev = 3.398/4.302/5.513/0.890 ms
And now ping confirms it. Both IPv4 and IPv6 masquerading works.
### BIRD auto start
Now, enable starting bird, since for some reason this is not automatically done when installing it:
root@birdbase:/# systemctl enable bird
Synchronizing state for bird.service with sysvinit using update-rc.d...
Executing /usr/sbin/update-rc.d bird defaults
Executing /usr/sbin/update-rc.d bird enable
root@birdbase:/# systemctl enable bird6
Synchronizing state for bird6.service with sysvinit using update-rc.d...
Executing /usr/sbin/update-rc.d bird6 defaults
Executing /usr/sbin/update-rc.d bird6 enable
### BIRD logfile location
Since there is no separate syslog process in the container, create a directory where we can point logging configuration to later:

    root@birdbase:/# mkdir /var/log/bird
    root@birdbase:/# chown bird: /var/log/bird
    root@birdbase:/# true > /var/log/bird/bird.log; chown bird: /var/log/bird/bird.log
    root@birdbase:/# true > /var/log/bird/bird6.log; chown bird: /var/log/bird/bird6.log

Creating the log files by hand is necessary to work around a bug in the Debian packaging that causes the log file to be created with root as owner, which subsequently causes bird startup to fail because it cannot write to the log file as user bird. :-(
### IP forwarding
For IP forwarding, make sure you uncomment `net.ipv4.ip_forward=1` and `net.ipv6.conf.all.forwarding=1` in sysctl.conf inside the container. Hint: editing configuration files inside a container can be done from outside the container, by looking for them in the `rootfs` folder inside the container directories.
### Disabling ICMP error rate limiting
Since we'll be doing a lot of tracerouting in the example networks, it's nice to disable ICMP error rate limiting in sysctl.conf, to prevent hiccups while executing traceroute commands in quick succession:

    net.ipv4.icmp_ratelimit = 0
    net.ipv6.icmp.ratelimit = 0

You probably wouldn't want to do this in a production network. For more information, see [the blog post "A strange packet loss"](http://backreference.org/2012/11/16/a-strange-packet-loss/).
### Root password
You might also want to change the password for root, since it's set to some random string by default.
## Cleanup
Before the birdbase container is ready to be used as a template for cloning other containers, let's shut it down and remove some container-specific configuration. That way we won't accidentally start a new container with duplicate configuration, and the diff looks nicer when configuring a clone:

    lxcbird:/var/lib/lxc 1-# lxc-stop -n birdbase
    lxcbird:/var/lib/lxc 1-# sed -i /^lxc.network/d birdbase/config
    lxcbird:/var/lib/lxc 1-# /bin/true > birdbase/rootfs/etc/bird/bird.conf
    lxcbird:/var/lib/lxc 1-# /bin/true > birdbase/rootfs/etc/bird/bird6.conf
    lxcbird:/var/lib/lxc 1-# /bin/true > birdbase/rootfs/etc/network/interfaces

Finally, we can check that git only wants to store our bird and network configuration, and do so:

    lxcbird:/var/lib/lxc/birdbase 0-# git status
    On branch master
    Untracked files:
      (use "git add <file>..." to include in what will be committed)
            ./
    nothing added to commit but untracked files present (use "git add" to track)
    lxcbird:/var/lib/lxc/birdbase 0-# git add .
    lxcbird:/var/lib/lxc/birdbase 0-# git status
    On branch master
    Changes to be committed:
      (use "git reset HEAD <file>..." to unstage)
            new file:   config
            new file:   rootfs/etc/sysctl.conf
            new file:   rootfs/etc/bird/bird.conf
            new file:   rootfs/etc/bird/bird6.conf
            new file:   rootfs/etc/network/interfaces
    lxcbird:/var/lib/lxc/birdbase 0-# git commit -m "birdbase network and bird config"
    [...]

Right! As you might notice, there are "end hosts" in the drawing at the top of this page, and we just configured the base container to start bird and enable IP forwarding. That is neither needed nor wanted for the end host containers, but I don't really care, because it will not influence the working of the test environment: as long as bird has no configuration, end hosts will not attract traffic that's not meant for themselves. However, if you like, you can create two different containers to clone from, one for a router and one for an end host.
Let's head over to the next page to [meet the Birdhouse Factory network](/birdhouse-intro/README.md)!

ospf-intro/README.md Normal file

@@ -0,0 +1,381 @@
OSPF
====
In this tutorial, I'll explain a bit about how the OSPF routing protocol works, followed by a full hands-on tutorial to practice and see it in action yourself!
In the [previous page of this tutorial series](/birdhouse-vlans-vpn/README.md), we met the Birdhouse Factory sysadmin Carl, who was about to get a little depressed about the amount of manual work he needed to do to get his newly created set of routers to forward traffic in the right direction. He wondered if there would be a better way to do this, having the routers just tell each other what traffic should go where.
Luckily, such a thing actually exists. Instead of programming all routers in detail with the next hop for each possible subnet that exists in the network, it's possible to let the routers figure this out themselves.
## Step one: separate routers with interfaces
The network in this tutorial has four routers. Each of these routers has connections to multiple networks:
![OSPF network, separate](/ospf-intro/ospf-separate.png)
Each of the four routers is shown as a little "card" with information on it, which contains:
* A unique IP address-like number (in bold), which is a so-called 'Router ID'.
* Several interfaces, which are connected to different subnets, having a unique address in each of those subnets.
* A "stub" sign that is either missing or present on the interface.
If the network is a "stub", it's like a dead end road, and we expect that the network contains end hosts (like servers and workstations), but no other routers. If the network is not a stub, we consider it a network where other routers are present, and usually no end hosts.
So what's different from the network drawings of the Birdhouse Factory that we've seen in the previous tutorials? On the last page, we learned that sysadmin Carl had designed the whole network on paper first, and then had to program all routes to other networks into each router in the network. The image above looks like the complete opposite. We have four routers, and each of them only knows about its directly connected subnets, and has no idea at all about the existence of the other three routers. This seems strange, but it's an ideal starting point for a dynamic routing protocol like OSPF. The topology of an OSPF network is not laid out in advance. The network completely discovers itself "on the go".
## Step two: activating OSPF, discovering network topology
When looking at the picture with the four separate routers, you should already have noticed that some of the interfaces of different routers share the same subnet. For example, `10.0.1.5` of R1 and `10.0.1.4` of R5 are in the same subnet, and connected to the same vlan. This means they should be able to communicate with each other. Only, they don't know about each other yet.
*Warning: the following is a grossly oversimplified description of the OSPF discovery process, but just enough to grasp the basics that we need to know before trying it.*
### Discovering local neighbours
To discover which routers can directly see each other, each router does the following:
* Send out Hello! packets, describing themselves (containing, most importantly, their Router ID) on all interfaces that actively participate in the OSPF network (so all non-stub interfaces).
* Listen for Hello! packets from other routers, to learn who else is active out there.
![OSPF network with discovered neighbours](/ospf-intro/ospf-neighbours.png)
Now, the upper part of the network contains three routers which know they are neighbours, as does the lower part. But routers 2 and 6 do not know about each other's existence yet.
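
If you'd like to see that idea in a few lines of code, here's a rough Python sketch. It has nothing to do with the real Hello packet format: it only groups the active (non-stub) interfaces by subnet and concludes who would hear whose Hellos. The addresses of R1, R5 and R6 are the ones used on this page; `10.1.2.2` for R2 is a made-up placeholder, because R2's address in that subnet is not mentioned in the text.

```python
import ipaddress
from collections import defaultdict

# Non-stub (active OSPF) interfaces per router.
# R2's address is a hypothetical placeholder; the others appear in the text.
ACTIVE_INTERFACES = {
    "R1": ["10.0.1.5/24", "10.1.2.7/24"],
    "R2": ["10.1.2.2/24"],  # assumption: the real address is not given here
    "R5": ["10.0.1.4/24", "10.1.2.56/24"],
    "R6": ["10.0.1.8/24"],
}


def discover_neighbours(interfaces):
    """Return {router: set of routers that share at least one subnet with it}."""
    routers_per_subnet = defaultdict(set)
    for router, addresses in interfaces.items():
        for address in addresses:
            subnet = ipaddress.ip_interface(address).network
            routers_per_subnet[subnet].add(router)

    neighbours = {router: set() for router in interfaces}
    for members in routers_per_subnet.values():
        for router in members:
            # Everyone else on the same subnet hears this router's Hello.
            neighbours[router] |= members - {router}
    return neighbours


neighbours = discover_neighbours(ACTIVE_INTERFACES)
for router in sorted(neighbours):
    print(router, "sees", sorted(neighbours[router]))
```

R2 and R6 never end up in each other's neighbour set, simply because they have no subnet in common.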
### Discovering the full network topology
To discover the full topology of the network, the following happens:
* Send out the complete information card of the router itself, with all information of active "links" that the router has to all neighbours.
* Receive the same information from neighbour routers.
* When an information card of a router arrives, store it, and send it out again on all other interfaces that are not a stub, unless the information of this particular router was already received before.
And magically... After a short burst of network traffic, all routers have now received each other's information, and are in possession of the four cards with information. Now that each router has all information about the other ones, let's see what happens when we simply connect the information about the four different routers together, turning the shared subnet ranges into a subnet between the routers:
![OSPF network, joined together](/ospf-intro/ospf-together.png)
And now here's a very important characteristic of the OSPF protocol: *Each* of the four routers can do this, and each of them now has an overview of the complete topology of the network! And, all of them have the *exact* same one.
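
The "store it and pass it on, unless you've seen it before" rule can be tried out with a small Python simulation. This is only a toy model of the flooding idea, not of real OSPF packets. The neighbour relations follow the picture above. The subnet lists are assumptions pieced together from addresses on this page: R5's stub subnet is left out because its range is not given, and the `/24` for `10.50.1.0` is a guess.

```python
from collections import deque

# Who can hear whom directly (the result of the Hello step).
NEIGHBOURS = {
    "R1": {"R2", "R5", "R6"},
    "R2": {"R1", "R5"},
    "R5": {"R1", "R2", "R6"},
    "R6": {"R1", "R5"},
}

# Each router's own "information card": the subnets it is attached to.
CARDS = {
    "R1": ["10.0.1.0/24", "10.1.2.0/24", "10.3.56.0/24"],
    "R2": ["10.1.2.0/24", "10.50.1.0/24"],  # prefix length assumed
    "R5": ["10.0.1.0/24", "10.1.2.0/24"],   # stub subnet omitted (unknown)
    "R6": ["10.0.1.0/24", "10.34.2.0/24"],
}


def flood(neighbours, cards):
    """Flood every card through the network; return each router's database."""
    databases = {router: {router: cards[router]} for router in neighbours}
    # Messages still "on the wire": (sender, receiver, router the card is about).
    queue = deque(
        (router, peer, router)
        for router in neighbours
        for peer in sorted(neighbours[router])
    )
    while queue:
        sender, receiver, origin = queue.popleft()
        if origin in databases[receiver]:
            continue  # seen before: do not store or forward it again
        databases[receiver][origin] = cards[origin]
        for peer in sorted(neighbours[receiver] - {sender}):
            queue.append((receiver, peer, origin))
    return databases


databases = flood(NEIGHBOURS, CARDS)
assert all(database == CARDS for database in databases.values())
print("every router holds the same", len(CARDS), "cards")
```

The "seen before" check is what makes the burst of traffic stop by itself: every card is stored at most once per router.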
## Step three: figuring out shortest paths and determining next-hops
Now that each router has gathered enough information to assemble a detailed map of the complete network, the meaning of the abbreviation OSPF comes into play. OSPF stands for "Open Shortest Path First". Using the details of the network topology map, it's possible to find out the shortest path to every subnet that exists in the network (being stub or not). If you want to know how this is done, look up Dijkstra's algorithm, which is the graph algorithm that OSPF uses.
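
For the curious, here's a minimal Python sketch of Dijkstra's algorithm on our little network. It's a simplification under a few assumptions: routers and subnets are both nodes in the graph, leaving a router through an interface costs 10 (the default interface cost BIRD uses), arriving at a router from a subnet costs nothing, R5's stub subnet is omitted because its range is not given on this page, and the `/24` for `10.50.1.0` is a guess.

```python
import heapq

INTERFACE_COST = 10  # BIRD's default OSPF interface cost

# Router -> subnets it has an interface in.
ROUTER_LINKS = {
    "R1": ["10.0.1.0/24", "10.1.2.0/24", "10.3.56.0/24"],
    "R2": ["10.1.2.0/24", "10.50.1.0/24"],  # prefix length assumed
    "R5": ["10.0.1.0/24", "10.1.2.0/24"],   # stub subnet omitted (unknown)
    "R6": ["10.0.1.0/24", "10.34.2.0/24"],
}


def build_graph(router_links):
    """Router -> subnet edges cost 10, subnet -> router edges cost 0."""
    graph = {}
    for router, networks in router_links.items():
        for network in networks:
            graph.setdefault(router, {})[network] = INTERFACE_COST
            graph.setdefault(network, {})[router] = 0
    return graph


def shortest_distances(graph, source):
    """Plain Dijkstra: lowest total cost from source to every node."""
    distances = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > distances[node]:
            continue  # stale heap entry, a cheaper path was found already
        for neighbour, weight in graph[node].items():
            candidate = cost + weight
            if candidate < distances.get(neighbour, float("inf")):
                distances[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return distances


distances = shortest_distances(build_graph(ROUTER_LINKS), "R6")
for network in sorted(node for node in distances if "/" in node):
    print(network, distances[network])
```

Seen from R6, the directly connected subnets come out at cost 10 and the subnets behind R1 at cost 20. Those are the same numbers BIRD shows after the slash in `(150/10)` and `(150/20)` in the `show route` output in the hands-on part.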
While each router can determine the complete shortest path to any destination in the network, it might sound quite disappointing to know that most of this valuable information can not be used by any individual router to get an IP packet to its destination.
* First of all, the routing software (for which BIRD will be used in these tutorials) is not in charge of the forwarding of packets itself (which is done by the Linux kernel in these examples). This difference is well known as the difference between a "Control Plane" and a "Forwarding Plane", which have nothing to do with aircraft.
* The routing software (control plane) has to program the kernel (forwarding plane) with a next hop router for each existing subnet in the network, and it can not provide more information than just that next hop, which has to be in a directly connected subnet. So the OSPF routing process knows much more about the path that the packet to be forwarded will travel than it is able to tell the forwarding plane.
And why shouldn't we care too much about all of this? The fun part here is that each of the participating routers in the OSPF network knows exactly the same amount of information about the whole network topology, and uses the same way to calculate the shortest routes from itself to all subnets that exist in the network. So, it's not a problem at all that the OSPF process on R2 can only tell the Linux kernel to forward packets for `10.34.2.89` to a next hop of `10.1.2.56`, because it can rely on the fact that R5 will always forward them again to `10.0.1.8`, having R6 receive them, which will drop them into the connected subnet to reach the end host in there.
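
Here's a small Python sketch of that hop-by-hop behaviour. The forwarding tables are hand-written stand-ins for what OSPF would install, limited to the one route from the paragraph above; `FORWARDING_TABLES`, `ADDRESS_OWNER` and the function names are just names I made up for the example.

```python
import ipaddress

# Per-router forwarding tables: destination prefix -> next-hop address,
# or None when the prefix is directly connected.
FORWARDING_TABLES = {
    "R2": {"10.34.2.0/24": "10.1.2.56"},
    "R5": {"10.34.2.0/24": "10.0.1.8"},
    "R6": {"10.34.2.0/24": None},
}

# Which router owns which next-hop address.
ADDRESS_OWNER = {"10.1.2.56": "R5", "10.0.1.8": "R6"}


def lookup(table, destination):
    """Longest-prefix match of destination against one router's table."""
    address = ipaddress.ip_address(destination)
    matches = [
        ipaddress.ip_network(prefix)
        for prefix in table
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        raise LookupError(f"no route to {destination}")
    best = max(matches, key=lambda network: network.prefixlen)
    return table[str(best)]


def forward(first_router, destination):
    """Follow a packet hop by hop; each router only decides the next hop."""
    path = [first_router]
    while True:
        next_hop = lookup(FORWARDING_TABLES[path[-1]], destination)
        if next_hop is None:
            return path  # delivered into the connected subnet
        path.append(ADDRESS_OWNER[next_hop])


print(" -> ".join(forward("R2", "10.34.2.89")))
```

No router in this sketch knows the whole path. The packet still arrives, because every table was derived from the same map of the network.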
If you're confused now, don't worry. Take a short break and continue with the hands-on part to see it all happen! After doing so, re-read the above, which should probably make more sense then.
If you're not confused yet, here's some bonus information to think about: To make it even worse, the IP address of this next hop is not even used when actually forwarding a packet to the next router! It's only being used to determine the layer 2 MAC address of the interface of the next hop. :-) An IP packet contains a source and a destination IP address. It does not contain a list of routers that it needs to pass before reaching its destination. It does not even contain the address of the very first next hop that it needs to go to, and the receiving next router has no idea where the packet it just received has been, or which IP address was used to forward it. It only sees the MAC address of the sending router.
# Hands on!
Enough of this theoretical babble! Let's create the network ourselves, using some Linux containers and vlans with openvswitch!
Ok, first of all, to be honest, the stub links still look a bit sad, so let's connect a host to each of them. We will use these hosts later, when we're actually building the network, to execute tests between them and see if they can reach each other, and if so, using what route over the network:
![OSPF network, joined together with some hosts](/ospf-intro/ospf-together-hosts.png)
## Building the containers
Let's build some containers! If you don't have [a lab environment](/lxcbird/README.md) with a template 'birdbase' container yet, create it!
To create the eight containers we need, connected together in different networks, the following steps are needed:
1. Clone this git repository somewhere to be able to use some files from the ospf-intro/lxc/ directory inside.
2. lxc-clone the birdbase container several times:

        lxc-clone -s birdbase R1
        lxc-clone -s birdbase R2
        lxc-clone -s birdbase R5
        lxc-clone -s birdbase R6
        lxc-clone -s birdbase H12
        lxc-clone -s birdbase H10
        lxc-clone -s birdbase H8
        lxc-clone -s birdbase H5

3. Set up the network interfaces in the lxc configuration. This can be done by removing all network related configuration that remains from the cloned birdbase container, and then appending all needed interface configuration by running the fixnetwork.sh script that can be found in `ospf-intro/lxc/` in this git repository. Of course, have a look at the contents of the script first, before executing it. Since this example is only using IPv4 and single IP addresses on the interfaces, I simply added them to the lxc configuration instead of the network/interfaces file inside the container.

        . ./fixnetwork.sh

4. Copy extra configuration into the containers. The ospf-intro/lxc/ directory inside this git repository contains a little file hierarchy that can just be copied over the configuration of the containers. For each router, it's a network/interfaces configuration file which adds an IP address that corresponds with the Router ID to the loopback interface, and a simple BIRD configuration file that serves as a starting point for our next steps.
5. Start all containers:

        for router in 1 2 5 6; do lxc-start -d -n R$router; sleep 2; done
        for host in 12 10 8 5; do lxc-start -d -n H$host; sleep 2; done

6. Verify connectivity and look around a bit. Here's an example for R1:

        lxc-attach -n R1
        root@R1:/#
        root@R1:/# ip a
        1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
            link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
            inet 127.0.0.1/8 scope host lo
               valid_lft forever preferred_lft forever
            inet 10.9.99.1/32 scope global lo
               valid_lft forever preferred_lft forever
            inet6 ::1/128 scope host
               valid_lft forever preferred_lft forever
        247: vlan1001: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
            link/ether 02:00:0a:00:01:05 brd ff:ff:ff:ff:ff:ff
            inet 10.0.1.5/24 brd 10.0.1.255 scope global vlan1001
               valid_lft forever preferred_lft forever
            inet6 fe80::aff:fe00:105/64 scope link
               valid_lft forever preferred_lft forever
        249: vlan1012: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
            link/ether 02:00:0a:01:02:07 brd ff:ff:ff:ff:ff:ff
            inet 10.1.2.7/24 brd 10.1.2.255 scope global vlan1012
               valid_lft forever preferred_lft forever
            inet6 fe80::aff:fe01:207/64 scope link
               valid_lft forever preferred_lft forever
        251: vlan1356: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
            link/ether 02:00:0a:03:38:01 brd ff:ff:ff:ff:ff:ff
            inet 10.3.56.1/24 brd 10.3.56.255 scope global vlan1356
               valid_lft forever preferred_lft forever
            inet6 fe80::aff:fe03:3801/64 scope link
               valid_lft forever preferred_lft forever
        root@R1:/# ip r
        10.0.1.0/24 dev vlan1001 proto kernel scope link src 10.0.1.5
        10.1.2.0/24 dev vlan1012 proto kernel scope link src 10.1.2.7
        10.3.56.0/24 dev vlan1356 proto kernel scope link src 10.3.56.1
        root@R1:/# ping -c 3 10.3.56.8
        PING 10.3.56.8 (10.3.56.8) 56(84) bytes of data.
        64 bytes from 10.3.56.8: icmp_seq=1 ttl=64 time=0.545 ms
        64 bytes from 10.3.56.8: icmp_seq=2 ttl=64 time=0.084 ms
        64 bytes from 10.3.56.8: icmp_seq=3 ttl=64 time=0.078 ms
        --- 10.3.56.8 ping statistics ---
        3 packets transmitted, 3 received, 0% packet loss, time 1998ms
        rtt min/avg/max/mdev = 0.078/0.235/0.545/0.219 ms
        root@R1:/#
        root@R1:/# traceroute 10.34.2.5
        traceroute to 10.34.2.5 (10.34.2.5), 30 hops max, 60 byte packets
        connect: Network is unreachable

Looking good! :-) Feel free to test some more by attaching to the consoles of other containers and playing around.
## Basic BIRD configuration
To be able to use the OSPF routing protocol, we need to run a program on each router that implements it. BIRD, the BIRD Internet Routing Daemon, is one of those, and I'll be using it for all examples and tutorials here.
Let's have a look at the BIRD configuration of R6:

    # cat R6/rootfs/etc/bird/bird.conf
    router id 10.9.99.6;
    log "/var/log/bird/bird.log" all;
    debug protocols { states, routes, filters, interfaces }
    protocol kernel {
        import none;
        export all;
    }
    protocol device {
        # defaults...
    }

This is a really basic BIRD configuration:
* The Router ID is set to a unique value in the network, which is 10.9.99.6 for this router. Actually, this is not an IP address. It's just a 32-bit value, but it's written in the same notation we use for IPv4 addresses, instead of 0x0a096306 or 168387334.
* A kernel protocol. This is not a real routing protocol, but it's the way BIRD uses to export route information from the internal BIRD routing table to the Linux kernel (remember, the control plane programs the forwarding plane). The filters are set to export all routes that BIRD will be learning from other routers to the Linux kernel routing table, which is fine for now.
* A device protocol. This is also not a real routing protocol, but the way BIRD uses to import information about the local network interfaces that are already present in this router. Usually this is added to your BIRD configuration and just sits there, doing its thing.
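
If you want to convince yourself that the three notations of the Router ID really are the same 32-bit number, Python's standard `ipaddress` module can do the conversion. This is just a quick check and has nothing to do with BIRD itself:

```python
import ipaddress

router_id = ipaddress.IPv4Address("10.9.99.6")

print(int(router_id))                    # 168387334
print(f"{int(router_id):#010x}")         # 0x0a096306
print(ipaddress.IPv4Address(168387334))  # 10.9.99.6
```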
As you have seen above, all of the routers currently only see their connected subnets. R1 which was used as an example above has no idea how to reach a computer with IP address 10.34.2.5, because it has no available route to a network this address is in:

    root@R1:/# ip r
    10.0.1.0/24 dev vlan1001 proto kernel scope link src 10.0.1.5
    10.1.2.0/24 dev vlan1012 proto kernel scope link src 10.1.2.7
    10.3.56.0/24 dev vlan1356 proto kernel scope link src 10.3.56.1

`ip r` shows the Linux kernel route table, which is used to actually forward packets. The BIRD process has its own internal routing table, which can also be shown:

    root@R1:/# birdc show route
    BIRD 1.4.5 ready.
    root@R1:/#

Well, actually it's still empty now. :-)
birdc is a little program which connects to a running BIRD process for diagnostics and for manipulation of the running protocols, like disabling or enabling them:

    root@R1:/# birdc
    BIRD 1.4.5 ready.
    bird> show route
    bird> show ?
    show bfd ...                                   Show information about BFD protocol
    show interfaces                                Show network interfaces
    show memory                                    Show memory usage
    show ospf ...                                  Show information about OSPF protocol
    show protocols [<protocol> | "<pattern>"]      Show routing protocols
    show roa ...                                   Show ROA table
    show route ...                                 Show routing table
    show static [<name>]                           Show details of static protocol
    show status                                    Show router status
    show symbols ...                               Show all known symbolic names
    bird> show status
    BIRD 1.4.5
    Router ID is 10.9.99.1
    Current server time is 2015-06-07 00:51:52
    Last reboot on 2015-06-07 00:02:37
    Last reconfiguration on 2015-06-07 00:43:57
    Daemon is up and running
    bird>
    bird> show protocols
    name     proto    table    state  since       info
    kernel1  Kernel   master   up     00:02:37
    device1  Device   master   up     00:02:37

## Configuring OSPF
The moment we've all been waiting for! Add the following to bird.conf of R6. Editing the bird.conf file can be done from outside the container.

    protocol ospf {
        area 0 {
            interface "lo" {
                stub;
            };
            interface "vlan1001" {
            };
            interface "vlan1034" {
                stub;
            };
        };
    }

Now, tell BIRD to reload the configuration:

    root@R6:/# birdc
    BIRD 1.4.5 ready.
    bird> configure
    Reading configuration from /etc/bird/bird.conf
    Reconfigured

An OSPF protocol instance has been started now, which has been provided with information that closely relates to the little "router information cards" seen earlier:
![OSPF R6](/ospf-intro/ospf-r6.png)
The interface in network `10.0.1.0/24` has been configured as an active OSPF interface. The interface in `10.34.2.0/24` is a stub, and the Linux loopback interface has also been specified as a stub, causing the `10.9.99.6/32` address that is present on the loopback interface to be included in the OSPF process.
After starting OSPF, the Linux kernel routing table still looks unchanged, but the routing table inside BIRD has changed:

    root@R6:/# ip r
    10.0.1.0/24 dev vlan1001 proto kernel scope link src 10.0.1.8
    10.34.2.0/24 dev vlan1034 proto kernel scope link src 10.34.2.1
    root@R6:/# birdc show route
    BIRD 1.4.5 ready.
    10.0.1.0/24        dev vlan1001 [ospf1 00:58:01] * I (150/10) [10.9.99.6]
    10.9.99.6/32       dev lo [ospf1 00:58:01] * I (150/0) [10.9.99.6]
    10.34.2.0/24       dev vlan1034 [ospf1 00:58:01] * I (150/10) [10.9.99.6]

Since BIRD has been told which interfaces are participating in the OSPF protocol, it has been able to determine which network ranges are active on those interfaces. When talking to other OSPF routers, this is the information that will be sent to them in the R6 information message!
There are more useful commands to show what R6 is currently seeing in the OSPF network:

    bird> show ospf topology
    area 0.0.0.0
        router 10.9.99.6
            distance 0
    bird> show ospf neighbors
    ospf1:
    Router ID       Pri          State      DTime   Interface  Router IP

Only, the output is not very exciting, because apparently there are no other routers available yet which can answer the Hellos that R6 is sending onto vlan1001...
## A second OSPF router, or more!
The inevitable must happen. Right now, you should know enough to be able to configure OSPF on R1 as well, which has two active OSPF interfaces, the stub interface which connects to the network with H8, and of course the loopback interface.
Go ahead, do it, now!
After adding the configuration and making BIRD reload it, `birdc show protocols` should show an active OSPF protocol. Now, just wait for a few seconds and do `ip r` again on R6, which shows us the routing table that is actually used by the forwarding process:

    root@R6:/# ip r
    10.0.1.0/24 dev vlan1001 proto kernel scope link src 10.0.1.8
    10.1.2.0/24 via 10.0.1.5 dev vlan1001 proto bird
    10.3.56.0/24 via 10.0.1.5 dev vlan1001 proto bird
    10.9.99.1 via 10.0.1.5 dev vlan1001 proto bird
    10.34.2.0/24 dev vlan1034 proto kernel scope link src 10.34.2.1

In the interactive BIRD console, `show route` can be used to see the view that BIRD has on the network. You can see that the three routes that have nexthop `10.0.1.5` were learned from router `10.9.99.1`, which is the Router ID of R1.

    bird> show route
    10.0.1.0/24        dev vlan1001 [ospf1 2015-06-07] * I (150/10) [10.9.99.6]
    10.1.2.0/24        via 10.0.1.5 on vlan1001 [ospf1 22:51:52] * I (150/20) [10.9.99.1]
    10.3.56.0/24       via 10.0.1.5 on vlan1001 [ospf1 2015-06-07] * I (150/20) [10.9.99.1]
    10.9.99.1/32       via 10.0.1.5 on vlan1001 [ospf1 2015-06-07] * I (150/10) [10.9.99.1]
    10.9.99.6/32       dev lo [ospf1 2015-06-07] * I (150/0) [10.9.99.6]
    10.34.2.0/24       dev vlan1034 [ospf1 2015-06-07] * I (150/10) [10.9.99.6]

I guess it's not very useful to continue typing much more text in this tutorial page now, because I'm almost certainly losing your attention. :-D Just go ahead, configure OSPF on the other two routers and see what happens. One fun thing to do is to start a `watch ip r` on R6 and see the changes happen live while you're working on the other routers.
Once OSPF is enabled on all four routers, you should be able to reach anything from anything in the whole network:

    root@H12:/# traceroute -n 10.34.2.5
    traceroute to 10.34.2.5 (10.34.2.5), 30 hops max, 60 byte packets
     1  10.50.1.1  0.199 ms  0.227 ms  0.119 ms
     2  10.1.2.56  0.238 ms  0.234 ms  0.284 ms
     3  10.0.1.8  0.283 ms  0.282 ms  0.332 ms
     4  10.34.2.5  0.310 ms  0.389 ms  0.246 ms

Or...

    root@H12:/# mtr -n -c 3 -r 10.3.56.8
    Start: Sun Jun 7 01:46:49 2015
    HOST: H12                Loss%   Snt   Last   Avg  Best  Wrst StDev
      1.|-- 10.50.1.1         0.0%     3    0.1   0.2   0.1   0.3   0.0
      2.|-- 10.1.2.7          0.0%     3    0.1   0.2   0.1   0.3   0.0
      3.|-- 10.3.56.8         0.0%     3    0.1   0.2   0.1   0.4   0.0

## More Dynamics!
There are two more topics I want to cover before ending this tutorial. The first one is how the network will handle changes in the availability of paths.
Attach to H12 and start an `mtr -n 10.34.2.5`. In my case here, it shows a path via `10.50.1.1` (R2), `10.1.2.56` (R5) and `10.0.1.8` (R6). Now, just for fun, do an `ip link set down vlan1012` on R5 if your traceroute shows the same path, or if the route is over R1, just down an interface on R1 instead, and watch what happens to your running mtr output. Doing this is equivalent to pulling a network cable out of a network port of a "real" router.
Here's mine:

                          My traceroute [v0.85]
    H12 (0.0.0.0)                            Sun Jun 7 01:56:35 2015
    Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                  Packets               Pings
     Host                       Loss%   Snt   Last   Avg  Best  Wrst StDev
     1. 10.50.1.1                0.0%   230    0.1   0.1   0.1   0.4   0.0
     2. 10.1.2.56                0.9%   230    0.1   0.1   0.1   0.3   0.0
        10.1.2.7
     3. 10.0.1.8                 0.9%   230    0.1   0.1   0.1   0.4   0.0
     4. 10.34.2.5                0.9%   230    0.1   0.1   0.1   0.4   0.0

![OSPF network, reconvergence](/ospf-intro/ospf-together-hosts-linkdown.png)
When I disabled the interface on R5, BIRD on R5 got notified by netlink that the interface went down. OSPF on R5 had to change its information card immediately and send it out again. But... it was only able to send it out on the `10.0.1.0/24` network. So it did, and R1 and R6 received it. Since R1 had not seen an update on the lower side of the network, it notified routers in there of the change and R2 was able to recalculate the shortest paths to the entire network after changing its view of the complete network topology with the missing link between R5 and the `10.1.2.0/24` network. After doing so, R2 determined that the current open shortest path to `10.34.2.5` had to be via `10.1.2.7` and used the BIRD kernel protocol to retract the route to `10.34.2.0/24` via `10.1.2.56` and inserted a new route into the Linux kernel routing table which points to `10.1.2.7` as next hop for `10.34.2.0/24`. And then, mtr noticed there was a change in the path.
Apparently, I lost a ping while the network was busy getting back into a stable, converged state. ;-(
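
To see why R2 ends up pointing at `10.1.2.7`, here's a Python sketch that recalculates R2's options before and after the link goes down. It is a simplification and not how BIRD does it internally: it runs Dijkstra once per neighbouring router and keeps the neighbours that give the lowest total cost. Interface cost 10 is BIRD's default; R5's stub subnet is left out (its range is not given on this page) and the `/24` for `10.50.1.0` is a guess.

```python
import heapq

COST = 10  # BIRD's default OSPF interface cost

LINKS = {
    "R1": {"10.0.1.0/24", "10.1.2.0/24", "10.3.56.0/24"},
    "R2": {"10.1.2.0/24", "10.50.1.0/24"},  # prefix length assumed
    "R5": {"10.0.1.0/24", "10.1.2.0/24"},   # stub subnet omitted (unknown)
    "R6": {"10.0.1.0/24", "10.34.2.0/24"},
}


def distances_from(links, source):
    """Dijkstra over routers and subnets; leaving a router costs COST."""
    graph = {}
    for router, networks in links.items():
        for network in networks:
            graph.setdefault(router, {})[network] = COST
            graph.setdefault(network, {})[router] = 0
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry
        for nxt, weight in graph.get(node, {}).items():
            if cost + weight < dist.get(nxt, float("inf")):
                dist[nxt] = cost + weight
                heapq.heappush(heap, (cost + weight, nxt))
    return dist


def next_hop_routers(links, source, target):
    """Neighbouring routers that lie on a lowest-cost path to target."""
    candidates = {}
    for network in links[source]:
        for router, networks in links.items():
            if router != source and network in networks:
                via = distances_from(links, router).get(target)
                if via is not None:
                    candidates[router] = COST + via
    if not candidates:
        return set()
    lowest = min(candidates.values())
    return {router for router, cost in candidates.items() if cost == lowest}


# "ip link set down vlan1012" on R5: R5 loses its interface in 10.1.2.0/24.
after_link_down = {router: set(networks) for router, networks in LINKS.items()}
after_link_down["R5"].discard("10.1.2.0/24")

print(sorted(next_hop_routers(LINKS, "R2", "10.34.2.0/24")))            # ['R1', 'R5']
print(sorted(next_hop_routers(after_link_down, "R2", "10.34.2.0/24")))  # ['R1']
```

Before the change, R1 and R5 are equally good next hops for R2, which explains why your traceroute might have gone via R1 where mine went via R5. After the change, only R1 is left.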
## The loopback address
The second thing I want to point out is about the /32 addresses on the loopback interfaces of the routers. I figure you might be wondering what they're useful for. Well, normally, a /32 address on a network interface would not make much sense. But imagine what happens when we include it in our OSPF process... It suddenly becomes a subnet whose reachability information is propagated throughout the whole network. Ok, this subnet can only contain a single address, but it's a perfect way to make sure that if any path exists to this single router in the network, OSPF will let you use it to connect to the router. So, if I'm the network administrator of the example network we've just built, and `10.50.1.12` is my workstation, I can connect to `10.9.99.5`, for example with SSH, to manage R5. Even if I accidentally disabled its link to the `10.1.2.0/24` network, my SSH session would simply stay active, with the traffic to and from R5 being rerouted via R1 back to my workstation... :-D Later on, in the BGP tutorial, we'll see that there are actually other routing protocols that rely on this mechanism to function correctly.
## Next...
There are numerous pages with information about OSPF on the internet. Since I could not really find one that did not immediately deep-dive into hundreds of pages of concepts like different types of LSAs, DR, BDR, Areas, and a lot of other complex things, instead of just showing that a bunch of routers can talk to each other, I created this tutorial to prove routing protocols are fun, and to encourage you to have fun building networks. :-)
First of all, don't forget to take a look at the BIRD documentation about OSPF. You can find it at User's guide -> Protocols -> OSPF at the [BIRD web page](http://bird.network.cz/). There are a lot more options than "stub". :) While I just proved you don't need to know about them to set up an interesting network with dynamic routing, there must be scenarios in which they can be very useful. For example:
* If there are untrusted hosts inside your routing vlans, you might want to use password authentication.
* If you want to decrease the time until the network gets reconfigured when a router crashes without notifying anyone, you might want to play with hello timers, or even BFD.
* Equal cost multipath routing (ECMP) is a big thing nowadays, which is used a lot to load balance traffic over multiple paths to a destination instead of choosing only one as best path. You can even enable that in the network we just built by just specifying `ecmp yes` in the OSPF configuration (try it on R2 or R6) and see what effect it has on the output of `ip r` on the Linux command line. Just search for information on it on the Internet to learn more.
* 'Cost' is an aspect that is fundamental to OSPF and the calculation of the shortest paths in the network. Traditionally, cost is related to the bandwidth of a link between routers, and causes higher bandwidth connections to be preferred above lower bandwidth connections. Since we're working with switched Gigabit/s networks by default now, if it's not 10Gb/s, in the datacenter and even in our office, I've just been ignoring that.
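
To illustrate the remark about cost and bandwidth: a common convention on router platforms is to divide a reference bandwidth by the bandwidth of the link, with a minimum of 1. The Python sketch below assumes a reference bandwidth of 100 Mbit/s, which is a widely used default but an assumption here, and not something BIRD does for you. In this tutorial every interface simply kept cost 10.

```python
REFERENCE_BANDWIDTH_MBIT = 100  # assumption: a commonly used default value


def bandwidth_cost(link_mbit, reference_mbit=REFERENCE_BANDWIDTH_MBIT):
    """Cost inversely proportional to link bandwidth, never lower than 1."""
    return max(1, reference_mbit // link_mbit)


for speed in (10, 100, 1000, 10000):
    print(f"{speed:>5} Mbit/s -> cost {bandwidth_cost(speed)}")
```

With this reference value, anything from 100 Mbit/s upwards ends up with the same cost of 1, which is one more reason not to worry about bandwidth-based cost in a network that is Gigabit everywhere.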
Another thing you can play with is rolling out IPv6 on this little network that was just built. It needs a `bird6.conf` configuration file, and you'll soon find out doing IPv6 is very similar to what we did here with IPv4. Just pick some subnets from the `2001:db8::/32` network to work with and there you go.
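
If you don't feel like carving up the documentation prefix by hand, Python's `ipaddress` module can hand out some /64 subnets. The vlan names below are the ones seen on R1 and R6 on this page; pairing them with these particular subnets is an arbitrary choice made for the example.

```python
import ipaddress
from itertools import islice

DOCUMENTATION_PREFIX = ipaddress.ip_network("2001:db8::/32")

# Arbitrary pairing of vlans with the first few /64s of the prefix.
vlans = ["vlan1001", "vlan1012", "vlan1034", "vlan1356"]
subnets = islice(DOCUMENTATION_PREFIX.subnets(new_prefix=64), len(vlans))
plan = dict(zip(vlans, subnets))

for vlan, subnet in plan.items():
    print(vlan, subnet)
```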
After completing this tutorial, I also encourage you to start reading the other "An Introduction to OSPF" like pages on the internet, since they should be a lot easier to understand while having seen it work for real! Have fun.
Next: [An introduction to BGP](/bgp-intro/README.md)


@@ -0,0 +1,13 @@
router id 10.9.99.1;

log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }

protocol kernel {
    import none;
    export all;
}

protocol device {
    # defaults...
}


@@ -0,0 +1,4 @@
auto lo
iface lo inet loopback
    up ip addr add 10.9.99.1/32 dev lo
    down ip addr del 10.9.99.1/32 dev lo


@@ -0,0 +1,13 @@
router id 10.9.99.2;

log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }

protocol kernel {
    import none;
    export all;
}

protocol device {
    # defaults...
}


@@ -0,0 +1,4 @@
auto lo
iface lo inet loopback
    up ip addr add 10.9.99.2/32 dev lo
    down ip addr del 10.9.99.2/32 dev lo


@@ -0,0 +1,13 @@
router id 10.9.99.5;

log "/var/log/bird/bird.log" all;
debug protocols { states, routes, filters, interfaces }

protocol kernel {
    import none;
    export all;
}

protocol device {
    # defaults...
}

Some files were not shown because too many files have changed in this diff.