OpenClinica User Manual/OfflineDistributions

Using Python Script to Create Subjects and Schedule OpenClinica Events edit

DNDi Data Center in Nairobi, Kenya has been exploring ways of using OpenClinica in “offline mode” and has developed a python Script which makes it possible to use the software in offline mode then later synchronize with a central database. The script is meant to work in an environment which is organized as follows;


1. First the Central database is setup on a main study server. Create the final version of the study database and define study sites and site users. This becomes the main or central study database and in this case is hosted at the DNDi Data Center Server in Nairobi. When this is done, we use the final study database dump to replicate the database on all site computers.


2. Proceed to collect data at the sites Once the database is installed on the site computers, the site users can proceed with data entry normally. No internet connection is needed since OpenClinica is locally installed on the site computers. Data is then periodically extracted from the study computers in CDISC ODM 1.3 Xml format and sent to DNDi Data Center in Nairobi for import into the Central database. Latest database dump is also generated and sent to Data Center.


3. Synchronization with the Central Study database at the Data Center. When the Data Center receives both the extracted data and the latest database dump, the latest database dump is plugged into Postgres database on the same server as the Central study database. This means that a database “using site name” is created on the same server as central database and site database dump restored on it. A python (see http://www.python.org) script which is referred to as OC Event Scheduler is executed. The script compares the Central database with the plugged-in site database, checking whether the subjects available in the site database are also available in the Central database. If the subjects are not available in the central database then they are created in the central database. The Script also checks whether the subjects events from the site database have been scheduled in the Central database, and if not, then the events are scheduled. Once new subjects are added and all events scheduled, then the extracted site data is then imported using OpenClinica data import functionality. The data import can either be done manually or an import task can be scheduled such that OpenClinica will then import all the ODM files posted on the import folder as scheduled.


OC Event Scheduler is a generic tool which is only “supplied” with the central database name, site database name, database username, database host address, and the database password in order to be used to synchronize the site and the central database.

Python Script Installation

To execute the script, the following software must be installed;

1. Apart from standard OpenClinica software stack, Python Software (from http://www.python.org/download/releases/2.7.2/) must be installed in order to execute the OC Event Scheduler script.

2. Psycopg2 (PostgreSQL database adapter for the Python programming language) is available for download from http://initd.org/pub/software/psycopg2 and also http://www.stickpeople.com/projects/python/win-psycopg/ for Windows.

Additional details on how the script works is available on the Script as comments.

the python script edit

import csv
import psycopg2
"""
@Author: Michael Ochieng
        Data Manager/ Programmer
        DNDi Africa Liason Office
        Nairobi, Kenya.
        Pythons Script For Bulk Scheduling of OpenClinica Events in the Central Database
        - In preparation of Bulk Import from a site database.
       ------------------------------------------------------------------------------------------------
       Assumptions
       ------------------------------------------------------------------------------------------------
       1. The Site Database Must be a replica of the Central Database.
       2. The Site database backup dump must be plugged into the same server containing the central database
"""

def __getDBConnection(central_dbname,site_dbname,username,host,passwd):
    try:

        """Connection to site database i.e the database to import data from"""
        conn = psycopg2.connect("dbname='"+site_dbname+"' user='"+username+"' host='"+host+"' password='"+passwd+"'")
        cur=conn.cursor()
        cur3=conn.cursor()
        
        """Conncetion to the Main Study Database i.e the database to import data to"""
        conn2 = psycopg2.connect("dbname='"+central_dbname+"' user='"+username+"' host='"+host+"' password='"+passwd+"'")
        cur2=conn2.cursor()   

        """
        1. We begin by importing the subject from OpenClinica subject table.
        First, select all the study_subjects from the site study database so the we can
        compare with study_subjects from the Main study database whether they are already
        in the system or not.
        """

        print "------------Importing Subject and Study_Subject data---------------------- "
        cur.execute("""select * from study_subject;""")
        study_subject_rows = cur.fetchall()
        the_subject_id=0
        if study_subject_rows:
            for row in study_subject_rows:
                cur2.execute("""select * from study_subject where label='"""+row[1]+"""'""")
                site_study_subject_rows=cur2.fetchall()

                """
                Meaning that if the subject with the label passed in the query above
                is not present, then go ahead and add the subject in the main database.
                """
                if len(site_study_subject_rows)==0:
                    
                    cur3.execute("""select * from subject where subject_id=(select subject_id from study_subject where label='"""+row[1]+"""');""")                    
                    rows3=cur3.fetchall()
                    
                    if rows3:
                        for row3 in rows3:
                            
                            print row3
                            subjectStr="INSERT INTO subject(subject_id, father_id, mother_id, status_id, date_of_birth, gender, unique_identifier, date_created,"
                            subjectStr+="owner_id, date_updated, update_id,dob_collected) VALUES (nextval('subject_subject_id_seq'),%s, %s, %s,"

                            subjectStr=subjectStr % ('null','null',row3[3])

                            #Check if Date of Birth is Null
                            if row3[4] is None:
                                subjectStr+="null,"
                            else:
                                subjectStr+="'%s',"                                                                
                                subjectStr=subjectStr % (row3[4])

                            #Check if Gender is Null
                            if row3[5] is None:

                                subjectStr+="null,"
                            else:
                                subjectStr+="'%s',"                                                                
                                subjectStr=subjectStr % (row3[5])
                                
                            subjectStr+=" '%s', '%s', %s, %s,%s, '%s');"
                            
                            subjectStr=subjectStr % (row3[6],row3[7],row3[8],'null','null',row3[11])

                            cur2.execute(subjectStr)
                            conn2.commit()
                        
                    strSql="INSERT INTO study_subject(study_subject_id, label, secondary_label, subject_id, study_id,status_id, enrollment_date,"
                    strSql+="date_created, date_updated, owner_id, update_id,oc_oid) VALUES (nextval('study_subject_study_subject_id_seq'),'%s',"
                    strSql+="'%s', currval('subject_subject_id_seq'), %s, %s, '%s', '%s', %s, %s,%s,'%s');"

                    strSql = strSql % (row[1],row[2],row[4],row[5],row[6],row[7],'null',row[9],'null',row[11])
                    cur2.execute(strSql)
                    conn2.commit()

        cur2.execute("""select study_subject_id,label from study_subject;""")
        ss_rows=cur2.fetchall()

        if ss_rows:
            for ss_row in ss_rows:
                print "------------Importing Study_Event data----------------------Subject ID: ", ss_row[0]
                cur.execute("""select * from study_event where study_subject_id=(select study_subject_id from study_subject where label='"""+str(ss_row[1])+"""');""")
                site_study_event_rows=cur.fetchall()
                if site_study_event_rows:
                    for i in site_study_event_rows:
                        selEventStr="select * from study_event where study_subject_id=(select study_subject_id from study_subject where label='%s') and study_event_definition_id=%s and sample_ordinal=%s;"
                        selEventStr=selEventStr %(str(ss_row[1]),str(i[1]),str(i[4]))

                        #print selEventStr
                        cur2.execute(selEventStr)
                        maindb_study_event_rows=cur2.fetchall()
                       
                        if len(maindb_study_event_rows)==0:
                            
                            eventStr="INSERT INTO study_event(study_event_id, study_event_definition_id, study_subject_id,location, sample_ordinal,"
                            eventStr+="date_start, date_end, owner_id, status_id,date_created, date_updated, update_id, subject_event_status_id,"
                            eventStr+="start_time_flag, end_time_flag) VALUES (nextval('study_event_study_event_id_seq'),%s,%s,'%s',%s,%s,%s,%s,%s,%s,%s,%s,1,'%s','%s');"

                            eventStr=eventStr %(i[1],ss_row[0],'null' if i[3].find('None')!=-1 else i[3],i[4],'null' if str(i[5]).find('None')!=-1 else "'"+str(i[5])+"'",'null' if str(i[6]).find('None')!=-1 else "'"+str(i[6])+"'",i[7],i[8],'null' if str(i[9]).find('None')!=-1 else "'"+str(i[9])+"'",'null' if str(i[10]).find('None')!=-1 else "'"+str(i[10])+"'",'null' if str(i[11]).find('None')!=-1 else i[11],i[13],i[14])
                            #print eventStr
                            cur2.execute(eventStr)
                            conn2.commit()
        cur.close()
        conn.close()
                

    except psycopg2.DatabaseError, e:
        print 'Error %s' % e    
        #return null

"""
@CentralDatabaseName: refers to the Central OC Database Name
@SiteDatabaseName: Refers to the database name for the site database as restored on the main server
@Username: Database user assigned to both databases
@localhostOrServeIP: The server IP Address or localhost if the script is executed from the server
@YourPassword: Refers to the user password corresponding to the @Username
Assumption
----------
1. The Same database login role is used when creating the CentralDatabase and the Site Database.

"""

__getDBConnection('CentralDatabaseName','SiteDatabaseName','UserName','localhostOrServeIP','YourPassword')