You are here: TWiki > Accounting Web>GratiaUpgradeProcedure (11 Dec 2008, KyleGross)

Gratia Upgrade Procedure

Database privileges:

  • ./post-install.sh summary-view
    • forces the update of the VIEWS. There is a DEFINER= parameter in the VOProbeSummay view which contains the original database/collector used. This can be reset by re-loading the views.

Gratia Upgrade Procedure

Overview

The intent is to minimize downtime (availability) of Gratia services. This is a general outline and the details for each step will be addressed in a subsequent section. This will also serve as a trial/template for future conversions where DB changes may require long conversion times.

Setup the temporary gratiax23 tomcat and gratia07 database

  1. Install the current production version (v0.32.1b) of Gratia on a test machine (gratiax23)
    1. To avoid complications and additional work, it is best if this is an instance that has no probes reporting to it.
    2. Turn off the MySql instance to reduce resource consumption
      service mysql5 stop
      
    3. We are only using the tomcat web services and will be pointing it to the gratia06 to effect the conversion.
    4. To be safe, turn off the update services: on the gratia-administration screen,
      • go to System-->Administration,
      • scroll down and click on the Stop Update Services link,
      • the Current Status should show Stopped
    5. Remember that every time you restart the tomcat service, the Gratia collector comes up in Update mode which will cause problems since we are converting the real database. This is why using an instance that is assured of having no probes reporting to it is very important.

  2. We are going to be switching the gratia09:/data/tomcat-gratia collector/reporter to use the gratia07 database while the conversion takes place on gratia06. So, to effect this, we need to do the following:
    1. Rather than take the gratia06 database off-line to do a backup to gratia07, we will use the nightly backup (currently a mysqlhotcopy).
    2. On gratia07:
      • The night before, remove all files under the /data/mysqldb/gratia directory and set the ownership on the directory to gratia.mysql.
      • Also the night before, verify/set the users and privileges so the gratiax23 has permissions to update the stored procedures:
        GRANT ALL ON *.* TO 'root'@'<your collector ip address>  identified by 'root password';
        GRANT GRANT OPTION ON *.* TO 'root'@'<your collector ip address> 
        ...note you can find the IP address by executing...
           host gratiax23.fnal.gov
        
        Also verify that the gratia06 has the same root permissions for it's specific IP address. If it doesn't, set them as above.

      • On the day of cutover, verify the ownership on the /data/mysqldb/gratia directory and the ownership on the table files are gratia.mysql. They will likely have been reset from the nightly backup.

      • If replication is in use, we need to turn it off temporarily and also be in a position to turn it back on once we have converted to using the gratia07 database:
        ... this creates a file to be used when turning Replication back on .... 
        select "update Replication set running=1 where replicationid=",replicationid,";" 
        from Replication where running=1 into outfile '/tmp/jgw.sql';
        ... be sure that the outfile does not already exist or it will fail
        
        ... then turn replication off for those ids
        update Replication set running=0 where running=1;
        

      • Update the SystemProplist table for the attributes below so the stored procedures, triggers and table statistics get updated correctly.
        update SystemProplist set cdr=0 where car='gratia.database.storedProcedureVersion';
        update SystemProplist set cdr=0 where car='gratia.database.summaryTriggerVersion';
        update SystemProplist set cdr=0 where car='gratia.database.TableStatisticsVersion';
        
        select * from SystemProplist;  
        
        +--------+----------------------------------------+------+
        | propid | car                                    | cdr  |
        +--------+----------------------------------------+------+
        |    211 | use.report.authentication              | true | 
        |    212 | gratia.database.version                | 28   | 
        |    214 | gratia.database.summaryTableVersion    | 26   | 
        |    215 | gratia.database.summaryTriggerVersion  |  0   | 
        |    216 | gratia.database.storedProcedureVersion |  0   | 
        |    217 | gratia.database.wantSummaryTable       | 1    | 
        |    218 | gratia.database.wantSummaryTrigger     | 1    | 
        |    219 | gratia.database.wantStoredProcedures   | 1    | 
        |    220 | gratia.database.TableStatisticsVersion |  0   | 
        +--------+----------------------------------------+------+
        

    3. On gratiax23:VDT_LOCATION/tomcat/v55:
      • change the ./gratia/service-configuration.properties file entries to use the gratia07 database
        service.mysql.url=jdbc:mysql://gratia07.fnal.gov:3320/gratia
        ... use the values from the gratia09:/data/tomcat-gratia instance....
        service.mysql.user=xxxxx
        service.mysql.password=xxxxx
        service.reporting.user=xxxxx
        service.reporting.password=xxxxx
        
      • Restart the tomcat service. It's also not a bad idea to remove the log files so it is easier to spot any problems.
        service tomcat-55 stop
        rm -f VDT_LOCATION/tomcat/v55/logs/*
        service tomcat-55 start
        
        ... since there should be no updates occurring, nor any replication 
                 you should see only this line in the catalina.out file and
                 no other activity
        INFO: Server startup in 14307 ms
        
      • A couple of things to look for in the logs (other than obvious stacktrace errors) is any failures in updating the triggers, procedures, summary tables, or summary views:
         grep -i fail ./logs/*  
        
        grep "Summary trigger updated successfully" *
        grep "Stored procedures updated successfully" *
        grep "Summary tables updated successfully" *
        grep "Summary view updated successfully" *
        grep "Table statistics updated successfully" *
        
        Another item to look for to verify you have connected to the right database would be to re-query the SystemProplist table on gratia07, checking that the cdr value changed on those set to 0.
        select * from SystemProplist;

    4. Verify you are connected correctly by bringing up the administration (http://gratiax23.fnal.gov:8880/gratia-administration) page in your browser. The counts shown and those on the http://gratia09.fnal.gov:8880/gratia-administration page should be reasonably close with the only difference being the data collected since the mysqlhotcopy was performed.

    5. We need to create the static report files
      • Execute the following (as root):
        /usr/local/osg-collector/tomcat/v55/gratia/staticReports.py \
          /usr/local/osg-collector/tomcat/v55 \
          http://gratiax23.fnal.gov:8880/gratia-reporting
        
      • Bring up the gratia reporting
        - http://gratiax23.fnal.gov:8880/gratia-reporting

      • Click on the Static Reports menu item and then verify each report comes up.

Bring the gratia07 into sync with the gratia06

We now have to "catch up" the gratia07 database before switching the gratia09 collector to use it.
  1. On gratia07, find the last dbid for the JobUsageRecord table.
    select max(dbid) from JobUsageRecord;
    +-----------+
    | max(dbid) |
    +-----------+
    |   8049804 | 
    +-----------+
    

  2. On http://gratia09.fnal.gov:8880/gratia-administration, add a replication entry pointing to http://gratiax23.fnal.gov:8880 and starting with the dbid from the previous step +1. Be sure to test it..*Do not activate it yet.*

  3. In the gratia06 gratia database, set the dbid in the Replication table to the 8049804.
    select * from Replication where openconnection like '%gratiax23%';
    update Replication set dbid=8049804  where openconnection like '%gratiax23%';
    

  4. Now back on the http://gratia09.fnal.gov:8880/gratia-administration replication screen, refresh the screen and verify the dbid value.

  5. If all is well, click the Start button. Then watch for a while in the gratiax23:VDT_LOCATION/tomcat/v55 logs files for any errors. You can also watch on the http://gratiax23.fnal.gov:8880/gratia-administration Status screen.

  6. On http://gratia09.fnal.gov:8880/gratia-administration, go to the System --> Administration page and then to Starting/Stopping Database Update Services section and select the Stop Update Services link. This will queue any updates and allow the two instances to sink without losing any data.

  7. WAIT UNTIL the All Time Total numbers are equal for both http://gratia09.fnal.gov:8880 and http://gratiax23.fnal.gov:8880. Then we are caught up and can proceed.

  8. On the http://gratia09.fnal.gov:8880/gratia-administration replication screen, Stop the replication for http://gratiax23.fnal.gov:8880.

  9. On the http://gratia09.fnal.gov:8880/gratia-administration replication screen, Delete the replication for http://gratiax23.fnal.gov:8880.

  10. On gratiax23, stop the tomcat service. This positions us to switch the gratia09:/data/tomcat-gratia tomcat service to using the gratia07 database.
    service tomcat-55 stop

Switch the gratia09 tomcat to use the gratia07 database

On gratia09, switch the tomcat instance using the gratia06 database to the backup on gratia07. The gratia07 database will be used for for production until the conversion is complete on gratia06.

This will be the first downtime period for the gratia Gratia service.

  1. Stop the tomcat-gratia
    service tomcat-gratia stop

  2. Edit the gratia09:/data/tomcat-gratia/gratia/service-configuration.properties file changing the service.mysql.url attribute to reference the gratia07 database.
    service.mysql.url=jdbc:mysql://gratia07.fnal.gov:3320/gratia 

  3. Start the tomcat-gratia
    service tomcat-gratia service
    It would probably be a good idea to tail the log files for a period to verify all is well.

Effect the conversion using gratiax23 tomcat and gratia06 database

  1. On gratiax23, the tomcat instance should have already been stopped. If not, do it quickly.
    service tomcat-55 stop
    

  2. On gratiax23, edit the gratiax23:VDT_LOCATION/tomcat/v55/gratia/service-configuration.properties file changing the service.mysql.url attribute to reference the gratia06 database.
    service.mysql.url=jdbc:mysql://gratia06.fnal.gov:3320/gratia 
    It is very important that this step be performed or the result will most likely be a mess.

  3. On gratiax23, edit the gratiax23:VDT_LOCATION/tomcat/v55/gratia/hibernate.cfg.xml file changing:
             <!-- SQL dialect -->
    from. <property name="dialect">org.hibernate.dialect.MySQLDialect</property>
    to.. <property name="dialect">org.hibernate.dialect.MySQLInnoDBDialect</property>
    

  4. On gratia06, verify/set the users and privileges so the gratiax23 has permissions to update the stored procedures:
    GRANT ALL ON *.* TO 'root'@'<your collector ip address>  identified by 'root password';
    GRANT GRANT OPTION ON *.* TO 'root'@'<your collector ip address> 
    ...note you can find the IP address by executing...
       host gratiax23.fnal.gov
    

  5. On gratiax23, start the gratia tomcat services:
    service tomcat-55 start
    

  6. On gratiax23, when the tomcat service initializes, it will detect any schema changes have been effected and a conversion process will begin.
    1. tail the catalina.out log
    2. When the conversion process completes the log will show the following message:
      INFO: Server startup in _xxx_ms

  7. Wait until the previous step completes before proceeding.

Bring the gratia06 into sync with the gratia07

Before we switch the gratia09 tomcat service back to using the gratia06, we have to do a "catch up" on the data the gratia07 has been collecting while the conversion was being performed.

  1. On gratia06, find the last dbid for the JobUsageRecord table.
    select max(dbid) from JobUsageRecord;
    +-----------+
    | max(dbid) |
    +-----------+
    |   8054381 | 
    +-----------+
    

  2. On http://gratia09.fnal.gov:8880/gratia-administration, add a replication entry pointing to http://gratiax23.fnal.gov:8880 and starting with the dbid from the previous step +1. Be sure to test it..*Do not activate it yet.*

  3. On gratia07 in the gratia database, set the dbid in the Replication table to the 8054381.
    select * from Replication where openconnection = 'http://gratiax23.fnal.gov:8880';
    update Replication set dbid=8054381  where openconnection = 'http://gratiax23.fnal.gov:8880';
    

  4. Now back on the http://gratia09.fnal.gov:8880/gratia-administration replication screen, refresh the screen and verify the dbid value.

  5. If all is well, click the Start button. Then watch for a while in the gratiax23:VDT_LOCATION/tomcat/v55 logs files for any errors. You can also watch on the http://gratiax23.fnal.gov:8880/gratia-administration Status screen.

  6. On http://gratia09.fnal.gov:8880/gratia-administration, go to the System --> Administration page and then to Starting/Stopping Database Update Services section and select the Stop Update Services link. This will queue any updates and allow the two instances to sink without losing any data.

  7. WAIT UNTIL the All Time Total numbers are equal for both http://gratia09.fnal.gov:8880 and http://gratiax23.fnal.gov:8880. Then we are caught up and can proceed.

  8. On the http://gratia09.fnal.gov:8880/gratia-administration replication screen, Stop the replication for http://gratiax23.fnal.gov:8880.

  9. On the http://gratia09.fnal.gov:8880/gratia-administration replication screen, Delete the replication for http://gratiax23.fnal.gov:8880.

  10. On gratiax23, stop the tomcat service. This positions us to switch the gratia09:/data/tomcat-gratia tomcat service to using the gratia06 database.
    service tomcat-55 stop

Switch gratia09 tomcat back to the gratia06 database

  1. On gratia09, stop the tomcat service for gratia.
    service tomcat-gratia stop
    

  2. On gratia09, edit the gratia09:/data/tomcat-gratia/gratia/service-configuration.properties file changing the service.mysql.url attribute to reference the gratia06 database.
    service.mysql.url=jdbc:mysql://gratia06.fnal.gov:3320/gratia 
    It is very important that this step be performed or the result will most likely be a mess.

  3. On gratia09, edit the gratia09:/data/tomcat-gratia/gratia/hibernate.cfg.xml file changing:
             <!-- SQL dialect -->
    from. <property name="dialect">org.hibernate.dialect.MySQLDialect</property>
    to.. <property name="dialect">org.hibernate.dialect.MySQLInnoDBDialect</property>
    

  4. On gratia09, start the gratia tomcat services:
    service tomcat-gratia start
    

  5. On gratia09, when the tomcat service initializes, it will detect any schema changes have been effected and a conversion process will begin.
    1. tail the catalina.out log
    2. When the conversion process completes the log will show the following message:
      INFO: Server startup in _xxx_ms

  6. WE ARE DONE AND CAN BREATH AGAIN!!!!


Major updates

-- JohnWeigand - 10 Mar 2008
Topic revision: r12 - 11 Dec 2008 - 18:51:44 - KyleGross
Hello, TWikiGuest
Register

 
TWIKI.NET

TWiki | Report Bugs | Privacy Policy

This site is powered by the TWiki collaboration platformCopyright by the contributing authors. All material on this collaboration platform is the property of the contributing authors..