Updating firmware for each blade server independently

If you are updating firmware for blade servers that are operating as independent partitions in a scalable blade complex or you are performing out-of-band updates to firmware for blade servers in a scalable blade complex, updates must be applied to each system in the scalable complex independently.

You can obtain the firmware updates from the IBM Fix Central website at http://www.ibm.com/support/fixcentral/systemx/groupView?query.productGroup=ibm%2FBladeCenter.

Complete the following steps to update the firmware for the blade servers:
Note: The blade servers in the scalable blade complex must be at the same firmware levels before they are restarted.
  1. Update the IMM firmware on the primary blade server. Then update the IMM firmware on the secondary blade server.
  2. Reset the IMM on the primary and secondary systems. Complete the following steps to reset the IMM through the advanced management module web interface:
    1. Click Blade Tasks > Power/Restart.
    2. Click the checkbox next to the blade servers to be reset.
    3. Click Available actions > Restart Blade System Mgmt Processor.
    4. Click Perform Action.
  3. Update the UEFI firmware on the primary blade server. Then update the UEFI firmware on the secondary blade server.
  4. Update the FPGA firmware on the primary blade server. Then update the FPGA firmware on the secondary blade server.
  5. Update the DSA Preboot firmware on the primary blade server. Then update the DSA Preboot firmware on the secondary blade server.
  6. Restart both blade servers to activate the firmware.
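
The ordering in the steps above can be sketched as a small plan generator. This Python sketch is illustrative only: the function name `build_update_plan` and the blade labels are hypothetical, and the code performs no firmware operations. It only lists the sequence (each firmware type on the primary blade server first, then the secondary, with the IMM reset after the IMM updates and a single restart at the end).

```python
# Illustrative only: lists the update order described above; it does not
# talk to any hardware. All names here are hypothetical.
FIRMWARE_ORDER = ["IMM", "UEFI", "FPGA", "DSA Preboot"]


def build_update_plan(primary="primary", secondary="secondary"):
    """Return the update steps as an ordered list of strings."""
    plan = []
    for firmware in FIRMWARE_ORDER:
        # Each firmware type goes to the primary blade first, then the secondary.
        plan.append(f"update {firmware} on {primary}")
        plan.append(f"update {firmware} on {secondary}")
        if firmware == "IMM":
            # The IMM is reset on both systems before the UEFI update.
            plan.append(f"reset IMM on {primary} and {secondary}")
    # Both blades are restarted only once every update has been applied.
    plan.append(f"restart {primary} and {secondary}")
    return plan


if __name__ == "__main__":
    for number, step in enumerate(build_update_plan(), start=1):
        print(f"{number}. {step}")
```
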
You can also use an Expect type script to automate advanced management module command-line interface (CLI) commands for updating the firmware for both blade servers. Complete the following steps to use an Expect type script:
  1. Download the firmware for the BladeCenter HX5 blade server from the IBM Fix Central website at http://www.ibm.com/support/fixcentral/systemx/groupView?query.productGroup=ibm%2FBladeCenter. Place the files on a TFTP server that is on the same TCP/IP subnet as the advanced management module for the chassis in which the blade servers are installed.
    Note: Remember to record the directory location on the TFTP server where you place the files; you will need that location to run the Expect type script.
  2. Generate an Expect type script that will log in to the advanced management module CLI, update the firmware for the blade servers, and restart the blade servers when complete.
  3. From a computer that is on the same TCP/IP subnet as the advanced management module for the chassis in which the scalable blade complex is installed, run the Expect type script.
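
Step 3 can be wrapped in a small launcher. The following Python sketch builds the argument list in the order that the usage message of the example script in this topic expects: chassis IP address, user ID, password, TFTP server, TFTP file name, and blade number. The script file name `FlashMNBladeViaAmm.exp`, the addresses, the credentials, and the firmware file name are placeholder assumptions. Substitute your own values.

```python
import subprocess


def build_command(script, chassis_ip, userid, password,
                  tftp_server, tftp_filename, blade_no):
    """Return the argv list for running the Expect script."""
    return ["expect", script, chassis_ip, userid, password,
            tftp_server, tftp_filename, str(blade_no)]


def run_update(command):
    """Run the script and return its combined output as text."""
    completed = subprocess.run(command, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT, text=True)
    return completed.stdout


if __name__ == "__main__":
    # Placeholder values (assumptions): replace with your own.
    command = build_command("FlashMNBladeViaAmm.exp", "192.0.2.10",
                            "example-user", "example-password",
                            "192.0.2.20", "firmware/example.upd", 1)
    # Only print the command here; call run_update(command) to execute it.
    print(" ".join(command))
```

Because the password is passed as a command-line argument, it can be visible to other users of the same computer in the process list while the script runs. Consider running the launcher from a computer with restricted access.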

Example of an Expect script

The following script illustrates how an Expect type script might be created to update the firmware for both blade servers.

Important: IBM does not provide support for Expect scripts. For more information about using Expect, see the Expect website at http://expect.sourceforge.net/. For more information about using the advanced management module CLI, see the BladeCenter Advanced Management Module Command-Line Interface Reference Guide at http://publib.boulder.ibm.com/infocenter/bladectr/documentation/topic/com.ibm.bladecenter.advmgtmod.doc/adv_man_mod_printable_doc.html.
#!/usr/bin/expect

################################################################################
#                                                                              #
#  This tool has been built from the following sources:                        #
#                                                                              #
#  support/FlashMNBladeViaAmm.exp                     : 1.1                    #
#  support/include/Log.exp                            : 1.9                    #
#  support/include/AMM.exp                            : 1.29                   #
#  support/include/MultiNode.exp                      : 1.1                    #
#  support/include/FlashBlade.exp                     : 1.16                   #
#                                                                              #
################################################################################

################################################################################
#                                                                              #
#  Code from source      : support/include/Log.exp                             #
#                                                                              #
################################################################################

################################################################################
#                                                                              #
#  Globals.                                                                    #
#                                                                              #
################################################################################

log_user 0
exp_internal -f /tmp/diag.txt 0

set fm_logfile ""
set g_normal_timeout_value 30
set timeout $g_normal_timeout_value

################################################################################
#                                                                              #
#  Init the logging system.                                                    #
#                                                                              #
################################################################################

proc log_init { display_stdout } {
    global fm_logfile

    if {$display_stdout == 0} {
        log_user 1
    }

    set fm_logfile "/tmp/expect_logs.txt"
}

################################################################################
#                                                                              #
#  Log to a directory.                                                         #
#                                                                              #
################################################################################

proc log_init_directory { directory } {
    global fm_logfile

    set fm_logfile "$directory/ExpectLogs.txt"
    exp_internal -f $directory/ExpectDiag.txt 0
}

################################################################################
#                                                                              #
#  Temp hack.                                                                  #
#                                                                              #
################################################################################

proc log_init_custom { logfile } {
    global fm_logfile 
    global g_module_name
    global g_test_results_base_dir

    #
    #  Cache the module name.
    #

    set g_module_name $logfile

    #
    #  Figure out the logfile path.
    #

    test_results_set_base_dir

    #
    #  Set it.
    #

    set fm_logfile "$g_test_results_base_dir/Logfile.txt"
}

################################################################################
#                                                                              #
#  Capture a log message with a nice time stamp.                               #
#                                                                              #
################################################################################

proc ft_log { message } {
    global fm_logfile

    set date_val [ timestamp -format "%m/%d: %X: " ]

    log_file $fm_logfile
    send_log -- "$date_val $message\n"
    log_file

    send_user -- "$date_val $message\n"
}

################################################################################
#                                                                              #
#  Bail on a critical error.                                                   #
#                                                                              #
################################################################################

proc ft_error { message } {
    ft_log "ERROR: $message"
    puts "\n\nERROR: $message"
    exit 1
}

################################################################################
#                                                                              #
#  Code from source      : support/include/AMM.exp                             #
#                                                                              #
################################################################################

################################################################################
#                                                                              #
#  Globals.                                                                    #
#                                                                              #
################################################################################

set amm_id ""                          ;#  Spawn ID for AMM ssh connection.
set save_amm ""                        ;#  Save pointer of original amm value.
set save_target ""                     ;#  Save pointer for current AMM state.
set save_userid ""                     ;#  Save pointer of original userid value.
set save_password ""                   ;#  Save pointer of original password value.
array set g_imm_fw_levels  { }         ;#  Array of IMM firmware levels.
array set g_uefi_fw_levels { }         ;#  Array of uEFI firmware levels.

################################################################################
#                                                                              #
#  Unexpected EOF handler.                                                     #
#                                                                              #
################################################################################

proc eof_handler { } {
    global amm_id save_amm save_userid save_password

    ft_log "Unexpected EOF talking to AMM."

    #
    #  Clean up any zombies. 
    #

    catch {close -i $amm_id}
    wait -nowait

    #
    #  The AMM closed the connection on us -- try to resume.
    #

    set amm_id ""

    set rv [ amm_login $save_amm $save_userid $save_password ] 
    set rv [ amm_restore_save_target ]
}

################################################################################
#                                                                              #
#  Save off the current target value.                                          #
#                                                                              #
################################################################################

proc amm_save_target { string } {
    global save_target

    set save_target $string
}

################################################################################
#                                                                              #
#  Restore the AMM to its saved target value.                                  #
#                                                                              #
################################################################################

proc amm_restore_save_target { } {
    global amm_id save_target

    send -i $amm_id "env -T $save_target\r"

    expect -i $amm_id -exact "OK" {
        return 0
    }

    ft_error "Unable to restore AMM target after disconnect."
}

################################################################################
#                                                                              #
#  Handy function to collect all flash failure logs for a given blade.         #
#                                                                              #
################################################################################

proc collect_flash_failure_logs { blade } {
    global g_target_blade g_test_results_dir

    #
    #  Create a storage space for our output.
    #

    set g_target_blade $blade
    set rv [ test_results_set_cwd ]

    #
    #  Have to be on an MM[N] target.
    #

    set rv [ amm_set_mm_target ]

    #
    #  Grab the VDBG data from the AMM.
    #

    ft_log "Blade: $blade -- Collecting AMM vdbg log."

    set vdbg_output "$g_test_results_dir/AMM_vdbg.txt"
    set rv [ collect_vdbg $vdbg_output ]

    if {$rv == 0} {
        ft_log "Blade: $blade -- Successfully collected AMM vdbg log."
    } else {
        ft_log "Blade: $blade -- Failure collecting AMM vdbg log."
    }

    #
    #  Grab the FFDC data from the IMM.
    #

    ft_log "Blade: $blade -- Collecting IMM FFDC logs."

    set rv [ imm_ffdc_init_capture $blade ]
    set rv [ imm_ffdc_collect_capture $blade ]
    set fn [ imm_ffdc_get_service_file_name $blade ]
    set rv [ collect_file_from_amm service "." $fn $g_test_results_dir/IMM_FFDC.tgz ]

    if {$rv == 0} {
        ft_log "Blade: $blade -- Successfully collected IMM FFDC data."
    } else {
        ft_log "Blade: $blade -- Failure collecting IMM FFDC data."
    }

    #
    #  Cleanup.
    #

    set rv [ imm_ffdc_cleanup_amm $fn ]
}

################################################################################
#                                                                              #
#  Reset all of the configured blades in the chassis.                          #
#                                                                              #
################################################################################

proc reset_all_blades { } {
    global blade_presence_bits

    for {set slot 1} {$slot < 15} {incr slot 1} {

        if { ! [info exists blade_presence_bits($slot)]} {
            continue
        }

        set present $blade_presence_bits($slot)

        if {$present == 1} {

            set rv [ reset_blade $slot ]

            if {$rv != 0} {
                ft_log "Blade: $slot did not reboot."
            }
        }
    }
}

################################################################################
#                                                                              #
#  Reset a blade via the AMM.  Returns 0 on success and 1 on timeout.          #
#                                                                              #
################################################################################

proc reset_blade { blade } {
    global amm_id

    #
    #  Reboot the blade.
    #

    send -i $amm_id "reset -T blade\[$blade\]\r\n"

    expect -i $amm_id "OK" {
        ft_log "Blade $blade:  Rebooted host OS." 
        return 0
    } timeout {
        return 1
    }
}

################################################################################
#                                                                              #
#  Reset a blade via the AMM.  Returns 0 on success and 1 on timeout.          #
#                                                                              #
################################################################################

proc reset_blade_gator { blade } {
    global amm_id

    #
    #  Gator zap.
    #

    set gator_map    { 1 2 3 4 5 6 7 8 9 a b c d e f }
    set gator_offset [lindex $gator_map $blade]

    send -i $amm_id "dbg gator x $gator_offset -Tsystem:mm\[1\]\r\n"

    expect -i $amm_id "OK" {
        ft_log "Blade $blade:  Gator zap."
        return 0
    } timeout {
        return 1
    }
}

################################################################################
#                                                                              #
#  Reboot the AMM.                                                             #
#                                                                              #
################################################################################

proc reboot_amm { } {
    global amm_id

    #
    #  Reboot the AMM.
    #

    send -i $amm_id "reset\r"

    #
    #  The AMM CLI needs to have the session opened until it goes away.
    #

    sleep 10

    ft_log "AMM:  Rebooted."

    return 0
}

################################################################################
#                                                                              #
#  Set the MM target to the value.                                             #
#                                                                              #
################################################################################

proc amm_set_mm_target { } {
    global amm_id

    #
    #  We should discover what bay the MM is in, hardcoded to 1 right now.
    #

    set mm 1

    send -i $amm_id "env -T system:mm\[$mm\]\r"
    expect -i $amm_id -exact "system:mm\[$mm\]"

    expect -i $amm_id "OK" { 
        set rv [ amm_save_target "system:mm\[$mm\]" ]
        return 0 
    }

    return 1
}

################################################################################
#                                                                              #
#  Set the CLI target to 'system'.                                             #
#    Returns 0 on success and 1 on failure.                                    #
#                                                                              #
################################################################################

proc amm_set_system_target { } {
    global amm_id

    send -i $amm_id "env -T system\r"

    expect -i $amm_id "OK" { 
        set rv [ amm_save_target "system" ]
        return 0 
    }

    return 1
}

################################################################################
#                                                                              #
#  Set the CLI target to a blade.                                              #
#    Returns 0 on success and 1 on failure.                                    #
#                                                                              #
################################################################################

proc amm_set_blade_target { blade_no } {
    global amm_id

    send -i $amm_id "env -T system:blade\[$blade_no\]\r"

    expect -i $amm_id "OK" { 
        set rv [ amm_save_target "system:blade\[$blade_no\]" ]
        return 0 
    }

    return 1
}

################################################################################
#                                                                              #
#  Collect the current SOL ready status.                                       #
#                                                                              #
################################################################################

proc blade_collect_sol_ready_status { } {
    global amm_id

    send -i $amm_id "sol\r"

    expect -i $amm_id "OK" {
    } timeout {
        return 1
    }

    expect -i $amm_id "SOL Session: Ready" {
        return 0
    }

    return 1
}


################################################################################
#                                                                              #
#  Log into the AMM.                                                           #
#                                                                              #
################################################################################

proc amm_login { amm userid password } {
    global amm_id save_amm save_userid save_password

    #
    #  Backup our login creds.
    #

    set save_amm $amm
    set save_userid $userid
    set save_password $password

    #
    #  SSH command with no host key checking.
    #

    spawn ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -l$userid $amm

    set amm_id $spawn_id
    ft_log "AMM: Login -- id: $amm_id"

    #
    #  Install an end-of-file handler in case the AMM connection dies.
    #

    expect_after -i $amm_id eof eof_handler 

    #
    #  Log into the AMM.
    #

    expect {
        "password:" {
            send "$password\r"
        }
    }

    #
    #  Make sure we made it.
    #

    expect -exact "system>"
}

################################################################################
#                                                                              #
#  Log out of the AMM.  (Be nice to the CLI, it won't run commands some times  #
#  if you close the connection on it too soon).                                #
#                                                                              #
################################################################################

proc amm_logout { } {
    global amm_id 

    #
    #  Log out and let the CLI figure out what happened.
    #

    send -i $amm_id "exit\r"
    catch {close -i $amm_id}

    #
    #  Reap the child process.
    #

    wait

    ft_log "AMM: Logout -- id: $amm_id"
    set amm_id ""
}

################################################################################
#                                                                              #
#  Collect the blade info from the AMM.                                        #
#                                                                              #
################################################################################

proc collect_blade_info { blade } {
    global amm_id amm g_imm_fw_levels g_uefi_fw_levels

    send -i $amm_id "info -T blade\[$blade\]\r"

    #
    #  Find the BIOS string.
    #

    expect -i $amm_id "BIOS" {

        expect -i $amm_id "Build ID:" {

            expect -i $amm_id "\n" {

                set temp $expect_out(buffer)
                set length [ string length ${temp} ]
                set length [ expr $length - 3 ]

                set uefi_level [string range ${temp} 1 $length]
                set g_uefi_fw_levels($blade) $uefi_level
            }
        }
    }

    #
    #  Find the SP string.
    #

    expect -i $amm_id -re "Blade*" {

        expect -i $amm_id "Build ID:" {

            expect -i $amm_id "\n" {

                set temp $expect_out(buffer)
                set length [ string length ${temp} ]
                set length [ expr $length - 3 ]

                set imm_level [string range ${temp} 1 $length]
                set g_imm_fw_levels($blade) $imm_level
            }
        }
    }

    return 0
}

################################################################################
#                                                                              #
#  Collect the blade power state from the AMM.                                 #
#                                                                              #
################################################################################

proc collect_blade_power_state { blade } {
    global amm_id amm 

    send -i $amm_id "info -T blade\[$blade\]\r"
}

################################################################################
#                                                                              #
#  Collect a file from the AMM.                                                #
#                                                                              #
################################################################################

proc collect_file_from_amm { remote_directory filename local_copy } {
    global amm userid password

    set command "/usr/bin/curl"
    set arg1 "--silent"
    set arg2 "--user"
    set arg3 "${userid}:${password}"
    set arg4 "ftp://${amm}/${remote_directory}/${filename}"
    set arg5 "-o"
    set arg6 "${local_copy}"

    set run_command [list exec $command $arg1 $arg2 $arg3 $arg4 $arg5 $arg6]

    if {[catch $run_command result]} {
        ft_log "Curl: command crashed with result $result fetching $arg4"
        ft_log "Curl: The command was: ($command $arg1 $arg2 $arg3 $arg4 $arg5 $arg6)"
        return 1
    }

    return 0
}

################################################################################
#                                                                              #
#  Delete a file from the AMM.                                                 #
#                                                                              #
################################################################################

proc delete_file_from_amm { filename } {
    global amm_id

    send -i $amm_id "files -d ${filename}\r"

    expect -i $amm_id "OK" {
        return 0
    }
   
    return 1 
}

################################################################################
#                                                                              #
#  Code from source      : support/include/MultiNode.exp                       #
#                                                                              #
################################################################################

set multinode_complex [ list ]

#############################################################################
#                                                                           #
#  Build a list of complexes.                                               #
#                                                                           #
#############################################################################

proc populate_complex_list { } { 
    global amm_id multinode_complex

    set multinode_entry [ list ]

    send -i $amm_id "scale\r\n"

    expect {

         #
         #  Find the complex ID.
         #

        -i $amm_id "Complex ID:" {

            expect -i $amm_id "\n" {
                 set temp $expect_out(buffer)
                 set complex [ string trimright $temp ]
                 set complex [ string range ${complex} 1 4 ]

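                 #  Keep the previous complex before starting a new entry.
                 if {[llength $multinode_entry] > 0} {
                     lappend multinode_complex $multinode_entry
                 }

                 set multinode_entry [ list ]
                 lappend multinode_entry ${complex}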
            } timeout {
                ft_error "parse error"
            }

            exp_continue
        }

         #
         #  Find the slots.
         #

        -i $amm_id "Bay: " {

            expect -i $amm_id "\n" {
                 set temp $expect_out(buffer)
                 set bay [ string trimright $temp ]
                 set bay [ string range ${bay} 0 [string length ${bay}]]

                 lappend multinode_entry ${bay}

            } timeout {
                ft_error "parse error"
            }

            exp_continue
         }

        -i $amm_id "No scalable complex found" {
            ft_log "AMM: No multi nodes found."
        }
    }

    lappend multinode_complex $multinode_entry
}

#############################################################################
#                                                                           #
#  Return a list element for a given slot configuration.                    #
#                                                                           #
#############################################################################

proc get_multinode_list_for_slot { slot_no } {
    global multinode_complex

    set empty [ list ]
    set temp [ list ]

    #
    #  Return an empty list if the multinode complex list is empty.
    #

    set count [ llength $multinode_complex ]

    if {$count == 0 } {
        return $empty
    }

    #
    #  Search each list in the multinode complex list.
    #

    foreach temp $multinode_complex {

        #
        #  Now search the sublist.
        #

        foreach temp1 $temp {
            if {$temp1 == $slot_no} {
                return $temp
            }
        }
    } 

    return $empty
}

#############################################################################
#                                                                           #
#  Send the update command for the blade.                                   #
#                                                                           #
#############################################################################

proc flash_update_mn_blade { blade_no firmware_image } {
    global amm_id tftp_server g_normal_timeout_value

    #
    #  Tell the AMM no timeout.
    #

    send -i $amm_id "telnetcfg -t 0\r\n"
    expect -i $amm_id -exact "OK"

    #
    #  Populate a list of multi node targets.
    #

    set slots [ list ]
    set slots [ get_multinode_list_for_slot $blade_no ]

    #
    #  Validate it has data.
    #

    set count [ llength $slots ]

    if {$count == 0} {
        ft_error "Unable to find any valid multi node configuration."
        return 1 
    }

    #
    #  Get a big timeout value while we flash.
    #

    set timeout 1000

    set complex_name [ lindex $slots 0 ]

    ft_log "Attempting to flash complex: $complex_name"

    #
    #  Flash each slot number.
    #

    foreach slot $slots {
        if {$slot == $complex_name} {
            continue
        }

        ft_log "Flashing slot number: $slot"

        #
        #  Send the update command.
        #

        send -i $amm_id "update -i $tftp_server -l $firmware_image -T system:blade\[$slot\]:sp\r\n"

        #
        #  Process results.
        #

        set rv 1

        expect {
            -i $amm_id "successful" { set rv 0 }
            -i $amm_id "meant"      { set rv 1 }
            -i $amm_id "failed"     { set rv 1 }
            -i $amm_id "*nable*"    { set rv 1 }
        }

        if {$rv == 0} {
            ft_log "AMM reports flash success for slot $slot"
        } else {
            return ${rv}
        }
    }

    #
    #  Restore the timeout and return the rv.
    #

    set timeout $g_normal_timeout_value

    return 0
}

################################################################################
#                                                                              #
#  Code from source      : support/include/FlashBlade.exp                      #
#                                                                              #
################################################################################

################################################################################
#                                                                              #
#  Sometimes the AMM leaves old UPD files hanging around.                      #
#                                                                              #
################################################################################

proc purge_old_upd_files { } {
    global amm_id

    #
    #  AMM53 series introduced a strange behaviour that needs to 
    #  be investigated but can be worked around with a delay.
    #

    sleep 20

    #
    #  Look for stale files.
    #

    send -i $amm_id "files -T system:mm\[1\]\r\n"

    expect {
        -i $amm_id "Available:" { return }

        -i $amm_id "volatile/*.upd*" {
    
            puts "\n\n Must delete: $expect_out(buffer)\n\n" 
            return
        }
    }
}

#############################################################################
#                                                                           #
#  Send the update command for the blade.                                   #
#                                                                           #
#############################################################################

proc flash_update_blade { blade_no firmware_image } {
    global amm_id tftp_server g_normal_timeout_value

    #
    #  Get a big timeout value while we flash.
    #

    set timeout 1000

    #
    #  Make sure the AMM knows too.
    #

    send -i $amm_id "telnetcfg -t 0\r\n"
    expect -i $amm_id -exact "OK"

    #
    #  Populate a list 
    #


    #
    #  Send the update command.
    #

    send -i $amm_id "update -i $tftp_server -l $firmware_image -T system:blade\[$blade_no\]:sp\r\n"

    #
    #  Process results.
    #

    set rv 1

    expect {
        -i $amm_id "successful" { set rv 0 }
        -i $amm_id "meant"      { set rv 1 }
        -i $amm_id "failed"     { set rv 1 }
        -i $amm_id "*nable*"    { set rv 1 }
    }

    #
    #  Restore the timeout and return the rv.
    #

    set timeout $g_normal_timeout_value

    return $rv
}

#############################################################################
#                                                                           #
#  This loop will flash all blades in a given chassis to a given level of   #
#  IMM or uEFI firmware via the AMM.                                        #
#                                                                           #
#############################################################################

proc flash_all_blades { firmware } {

    global blade_presence_bits

    for {set slot 1} {$slot < 15} {incr slot 1} {

        if { ! [info exists blade_presence_bits($slot)]} {
            continue
        }

        set present $blade_presence_bits($slot)

        if {$present == 1} {

            ft_log "Blade: $slot -- Updating to firmware: $firmware."

            set rv [ flash_update_blade $slot $firmware ]

            if {$rv == 0} {
                ft_log "Blade: $slot -- Firmware update success."
            } else {
                ft_log "Blade: $slot -- Firmware update failed."
                set rv [ collect_flash_failure_logs $slot ]
            }

            global amm_id
            send -i $amm_id "\r"
            sleep 2
        }
    }
}



#############################################################################
#                                                                           #
#  Script startup -- check usage and assign globals.                        #
#                                                                           #
#############################################################################

if {$argc < 6} {
    puts "USAGE: $argv0 <Chassis_Ip> <Userid> <Password> <TftpServer> <TftpFilename> <Blade_No>"
    exit 1
}

set amm           [lindex $argv 0]
set userid        [lindex $argv 1]
set password      [lindex $argv 2]
set tftp_server   [lindex $argv 3]
set tftp_filename [lindex $argv 4]
set blade_no      [lindex $argv 5]

#############################################################################
#                                                                           #
#  Code start.                                                              #
#                                                                           #
#############################################################################

set rv [ log_init 1 ]
set rv [ amm_login $amm $userid $password ]
set rv [ purge_old_upd_files ]
set rv [ populate_complex_list ]
set rv [ amm_set_mm_target ]
set rv [ flash_update_mn_blade $blade_no $tftp_filename ]

#
#  Display user output data.
#

if {$rv == 0} {
    ft_log "FlashStatusOut: success"
} else {
    ft_log "FlashStatusOut: failure"
}

exit
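
The script above reports its result in a log line that begins with `FlashStatusOut:`. A wrapper that runs the script can look for that line to decide whether the update worked. The following Python sketch shows one way to do so. The function name `parse_flash_status` and the sample line are assumptions made for illustration, not part of the IBM tooling.

```python
import re

# Matches the status line written by the script's final ft_log call.
STATUS_PATTERN = re.compile(r"FlashStatusOut:\s*(success|failure)")


def parse_flash_status(output):
    """Return 'success', 'failure', or None if no status line is present."""
    matches = STATUS_PATTERN.findall(output)
    # Use the last status line in case the output contains more than one.
    return matches[-1] if matches else None


if __name__ == "__main__":
    # Made-up sample line; real time stamps depend on the system locale.
    sample = "05/14: 10:32:07:  FlashStatusOut: success\n"
    print(parse_flash_status(sample))
```

A result of `None` means that no status line was found, for example when the script stopped early in `ft_error`. Treat that case as a failure and check the Expect log files.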