In part one of this series we looked at starting up a couple of Nexus 9000v machines using a tool called vagrant. It went OK, but we had some unfinished business. In this post we'll look at how I tried to address the MAC address issues and run my first ansible playbook against this lab.

UPDATE: The first two posts in this series detail the path to the solution that I'm presenting in part 3. The journey taught me a lot, so if you're not in a hurry, it's worth reading through.

A new and improved Vagrantfile

Vagrant, when booting up machines, needs SSH access in order to do the initial provisioning, but more importantly to confirm that the machine is up and running after boot-up. In this case it can't do that, because the NXOS CLI is not really a shell it knows how to handle, something I discovered the hard way while trying to fix something that's not really fixable (at least currently).

The DevNet instructions on how to set up an n9kv box in vagrant mention that you have to wait for vagrant to time out its SSH attempts after box boot-up. I didn't like that (or pressing Ctrl-C to stop it from retrying pointlessly), so I looked at the first problem: vagrant couldn't log into the router. I fixed it by adding the lines below to the Vagrantfile, telling it not to insert an SSH key and to simply use the password.

    # Skip inserting a vagrant-generated SSH key and authenticate with the password
    config.ssh.password = "vagrant"
    config.ssh.insert_key = false

What happened next is that vagrant could finally log into the machine, but it immediately failed because it couldn't figure out what shell it was talking to. It seemed like a small victory, until I tried to stop the machines with vagrant halt: the command failed with the same shell error and the machines kept running.
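One workaround worth knowing here: vagrant halt --force skips the graceful, SSH-based shutdown and simply powers the VM off, so it shouldn't trip over the shell detection (at the cost of an unclean shutdown):

> vagrant halt --force n9k1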

I wanted to make this work so that I could call an ansible playbook as the vagrant provisioner after the first boot of each machine to fix MAC addresses, hostnames and so on. It turns out that's not really possible in this case (please let me know if there's a way!), so manual it'll have to be.
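For reference, this is roughly what that would have looked like in the Vagrantfile, using vagrant's built-in ansible provisioner (a sketch; it fails here because vagrant never gets a usable shell on the box):

    config.vm.provision "ansible" do |ansible|
        ansible.playbook = "provision.yml"
    end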

The Vagrantfile below should be pretty familiar from part one, with the addition of a couple of forwarded ports. Normally vagrant forwards port 22 from the guest machine to a high port on the host so that you can SSH into every box. I wanted this high port to be fixed for each instance, and also to forward 443 for the REST NXAPI.

Vagrant.configure("2") do |config|

    # Deploy 2 Nodes with two links between them

    config.vm.define "n9k1" do |node|
        node.vm.box = "n9000v-4gb-ssh"
        node.vm.base_mac = "0800276CEEAA"

        # port forward for ssh and nxapi https calls
        node.vm.network "forwarded_port", guest: 22, host: 2221, auto_correct: true, id: "ssh"
        node.vm.network "forwarded_port", guest: 443, host: 4431, auto_correct: true

        # eth1/1 connected to nxeth1, auto-config not supported.
        node.vm.network :private_network, virtualbox__intnet: "nxeth1",
          auto_config: false, mac: "0800276CEE15"

        # eth1/2 connected to nxeth2, auto-config not supported.
        node.vm.network :private_network, virtualbox__intnet: "nxeth2",
          auto_config: false, mac: "0800276CEE16"

        node.vm.provider "virtualbox" do |v|
            v.customize ["modifyvm", :id, "--nicpromisc2", "allow-all"]
            v.customize ["modifyvm", :id, "--nicpromisc3", "allow-all"]
        end
    end

    config.vm.define "n9k2" do |node|
        node.vm.box = "n9000v-4gb-ssh"
        node.vm.base_mac = "0800276DEEAA"

        # port forward for ssh and nxapi https calls
        node.vm.network "forwarded_port", guest: 22, host: 2222, auto_correct: true, id: "ssh"
        node.vm.network "forwarded_port", guest: 443, host: 4432, auto_correct: true

        # eth1/1 connected to nxeth1, auto-config not supported.
        node.vm.network :private_network, virtualbox__intnet: "nxeth1",
          auto_config: false, mac: "0800276DEE15"

        # eth1/2 connected to nxeth2, auto-config not supported.
        node.vm.network :private_network, virtualbox__intnet: "nxeth2",
          auto_config: false, mac: "0800276DEE16"

        node.vm.provider "virtualbox" do |v|
            v.customize ["modifyvm", :id, "--nicpromisc2", "allow-all"]
            v.customize ["modifyvm", :id, "--nicpromisc3", "allow-all"]
        end
    end

end
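Once a box is up, vagrant port lists the guest-to-host port mappings that are actually in effect, which is worth checking because auto_correct is allowed to move a forwarded port if the requested one is already taken:

> vagrant port n9k1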

Trying to connect

Before I delve into the ansible goodies, I want to mention something that was not really obvious from the start: the two (or more) vagrant n9kv boxes will have different host keys, but they all sit on the same IP (127.0.0.1). And guess who thinks that's a security breach? Our SSH client!

I bumped into this issue in a rather funny way: I could use vagrant ssh n9k1 and vagrant ssh n9k2 to log into the two machines, but when I ran a simple ansible playbook against both, it failed to connect to n9k2. It worked fine via the REST NXAPI but not over SSH, which is what finally pointed me to the cause.

So it turns out ansible checks host keys by default (sensible), but it can be configured to ignore them should that be necessary, all documented here. I created an ansible.cfg in the same folder as the playbook and told it to ignore host keys. Job done.

[defaults]
host_key_checking = False
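The same thing can also be done per run with an environment variable, if you'd rather not keep a config file around:

> ANSIBLE_HOST_KEY_CHECKING=False ansible-playbook -i hosts provision.yml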

Finally, some ansible

By this point I was itching to get ansible (through the included network modules) to successfully talk to the boxes via both CLI (SSH) and the REST (HTTPS) NXAPI. And then to try to solve that non-unique MAC address issue and get the two machines to ping each other.

It took a bit of digging to figure out how to build the host inventory file and the provider details in the playbook so that the nxos_* modules could connect to the machines. The ansible documentation is not really clear on this, but with the help of bits of code found in various GitHub repos I managed to get it working.

First, the inventory file. Yes, it's not pretty, but it works. I wanted to keep things in one place for now; the next step will be to use ansible concepts such as host_vars and group_vars (there's a rough sketch of that layout right after the inventory).

[all:vars]
ansible_connection = local

[n9k]
n9k1 ansible_ssh_host=127.0.0.1 ansible_ssh_port=2221 ansible_nxapi_port=4431 eth11_mac="0800.276C.EE15" eth12_mac="0800.276C.EE16" eth11_ip="10.0.1.1/24" eth12_ip="10.0.2.1/24"
n9k2 ansible_ssh_host=127.0.0.1 ansible_ssh_port=2222 ansible_nxapi_port=4432 eth11_mac="0800.276D.EE15" eth12_mac="0800.276D.EE16" eth11_ip="10.0.1.2/24" eth12_ip="10.0.2.2/24"
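For the curious, that refactor would move the shared and per-host variables into small files, along these lines (a hypothetical layout, file names included):

# group_vars/n9k.yml (hypothetical)
ansible_connection: local

# host_vars/n9k1.yml (hypothetical)
ansible_ssh_host: 127.0.0.1
ansible_ssh_port: 2221
ansible_nxapi_port: 4431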

A quick note on ansible_connection = local: the nxos_* modules run on the ansible control machine and open their own connections to the devices, so ansible never tries to SSH into the boxes directly. The modules can be given the necessary connection parameters on every call (which gets messy quickly), or a previously defined dictionary that contains all of them, via the provider option. In the vars section of the playbook I have now defined two providers to use (ssh and nxapi), which pick up some of the parameters defined in the inventory file above.

  vars:
    ssh:
      host: "{{ ansible_ssh_host }}"
      username: "vagrant"
      password: "vagrant"
      transport: cli

    nxapi:
      host: "{{ ansible_ssh_host }}"
      username: "vagrant"
      password: "vagrant"
      transport: nxapi
      use_ssl: yes
      validate_certs: no
      port: "{{ ansible_nxapi_port }}"
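As a quick illustration of how these provider dictionaries are used, a task along these lines (a sketch, not part of the provisioning playbook below) would run a show command over the REST transport:

    - name: quick sanity check over nxapi
      nxos_command:
        commands:
          - show version
        provider: "{{ nxapi }}"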

That initial setup

I already have nxapi enabled in my enhanced base box image (see part one for details on how to build it), but since a stock box offers only SSH by default, this playbook uses the SSH provider to perform all the necessary setup steps and to make sure the REST NXAPI is enabled.

The playbook below is also in the GitHub repo, together with all the other supporting files.

---
- name: POST VAGRANT UP PROVISIONING
  hosts: n9k
  gather_facts: no

  vars:
    ssh:
      host: "{{ ansible_ssh_host }}"
      username: "vagrant"
      password: "vagrant"
      transport: cli

  tasks:
    - name: configure hostname and domain-name
      nxos_system:
        provider: "{{ ssh }}"
        hostname: "{{ inventory_hostname }}"
        domain_name: vagrantlab.local

    - name: ensure nxapi is enabled
      nxos_feature:
        feature: nxapi
        state: enabled
        provider: "{{ ssh }}"

    - name: setup eth1/1
      nxos_config:
        lines:
          - "no switchport"
          - "mac-address {{ eth11_mac }}"
          - "ip address {{ eth11_ip }}"
          - "no shutdown"
        parents: interface ethernet1/1
        match: strict
        provider: "{{ ssh }}"

    - name: setup eth1/2
      nxos_config:
        lines:
          - "no switchport"
          - "mac-address {{ eth12_mac }}"
          - "ip address {{ eth12_ip }}"
          - "no shutdown"
        parents: interface ethernet1/2
        match: strict
        provider: "{{ ssh }}"

    - name: test ping from n9k1 to n9k2
      when: inventory_hostname == "n9k1"
      nxos_ping:
        dest: "{{ item }}"
        provider: "{{ ssh }}"
      with_items:
        - 10.0.1.2
        - 10.0.2.2

    - name: save config
      nxos_config:
        save: yes
        provider: "{{ ssh }}"

This playbook is built around the two machines in this lab setup, so it is best to run it after both machines have been started (you will have to execute vagrant up n9k1, let it boot, then run vagrant up n9k2). You can tell the machines are done booting when your CPU stops having a heart attack, but also when vagrant ssh n9k2 gets you into the NXOS CLI.

All that's left now is to run the playbook. It changes the hostname of the boxes (they come from the same base image, so both will initially have the same name), configures eth1/1 and eth1/2 as L3 ports, and overrides their MAC addresses so that traffic can actually flow through them. Finally, it confirms connectivity with a simple ping from n9k1 to n9k2 and saves the configuration.

> ansible-playbook -i hosts provision.yml 

PLAY [POST VAGRANT UP PROVISIONING] ******************************************

TASK [configure hostname and domain-name] ************************************
changed: [n9k1]
changed: [n9k2]

TASK [ensure nxapi is enabled] ***********************************************
ok: [n9k2]
ok: [n9k1]

TASK [setup eth1/1] **********************************************************
changed: [n9k2]
changed: [n9k1]

TASK [setup eth1/2] **********************************************************
changed: [n9k2]
changed: [n9k1]

TASK [test ping from n9k1 to n9k2] *******************************************
skipping: [n9k2] => (item=10.0.1.2) 
skipping: [n9k2] => (item=10.0.2.2) 
ok: [n9k1] => (item=10.0.1.2)
ok: [n9k1] => (item=10.0.2.2)

TASK [save config] ***********************************************************
changed: [n9k2]
changed: [n9k1]

PLAY RECAP *******************************************************************
n9k1                       : ok=6    changed=4    unreachable=0    failed=0   
n9k2                       : ok=5    changed=4    unreachable=0    failed=0   

To confirm that both providers work, I've included another playbook that pulls "facts" from the two machines via both SSH and REST, displaying the LLDP neighbors from each machine.
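The full version is in the repo, but the heart of it is just a facts task plus a debug print, roughly like this (a sketch reusing the ssh provider from earlier; nxos_facts publishes the LLDP neighbors as ansible_net_neighbors):

    - name: pull facts over SSH
      nxos_facts:
        provider: "{{ ssh }}"

    - name: display the LLDP neighbors from the gathered facts
      debug:
        var: ansible_net_neighbors

Swapping in provider: "{{ nxapi }}" exercises the REST transport in exactly the same way.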

Are we there yet?

Almost. While it may look like we solved the MAC address problem, this manual override only works for L3 ports. If you'd like to test with L2 ports, NXOS won't accept the override there, so the various L2 protocols will still be broken by the non-unique MACs. The search continues!

It has been an interesting learning experience nonetheless and I hope that this will help others going down this path. Stay tuned for the next part of this series, where I expand and improve the ansible setup!

And, as always, thanks for reading.


Any comments? Contact me via Mastodon or e-mail.

