Platform Analytics 7 Dataflow


WI_JOBMART_HPC table

This topic describes the wi_jobmart_hpc table and how each column is derived from the raw data table.

The wi_jobmart_hpc table gets its data from the LSB_EVENTS raw data table.

Only the "JOB_FINISH" event records are retrieved from the LSB_EVENTS table.

The following describes each column of WI_JOBMART_HPC and how it is filled with data.

Column Name

Description

Key

CLUSTER_CODE

This comes from the CLUSTER_NAME field in the raw table. The cluster name is looked up in the wi_clustercode table to see whether a record for it already exists. If it does, the existing code is returned; otherwise, the name is inserted into the wi_clustercode table and a new code is generated. The code is a positive integer, and each new code is equal to the maximum existing sequence number + 1.

Primary key
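The lookup-or-insert pattern described for CLUSTER_CODE (and reused for the other *_CODE columns) can be sketched as follows. This is a minimal illustration only: the dictionary `code_table` is a hypothetical in-memory stand-in for a code table such as wi_clustercode, the function name is invented for this example, and starting at 1 for an empty table is an assumption.

```python
from typing import Dict


def lookup_or_create_code(code_table: Dict[str, int], name: str) -> int:
    """Return the code for `name`, creating a new one if it is missing.

    `code_table` is a hypothetical stand-in for a code table such as
    wi_clustercode; it maps a name to its positive integer code.
    """
    existing = code_table.get(name)
    if existing is not None:
        # A record already exists, so its code is returned unchanged.
        return existing
    # New code = maximum existing code + 1.
    # Assumption: an empty table starts numbering at 1.
    new_code = max(code_table.values(), default=0) + 1
    code_table[name] = new_code
    return new_code


cluster_codes: Dict[str, int] = {}
first = lookup_or_create_code(cluster_codes, "cluster_a")   # new record
second = lookup_or_create_code(cluster_codes, "cluster_b")  # new record
again = lookup_or_create_code(cluster_codes, "cluster_a")   # existing record
```

A real implementation would perform the lookup and the insert against the database, and would need to guard against two loaders generating the same code concurrently.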

QUEUE_TIME

This comes from the SUBMIT_TIME field in the raw table. It is in the GMT time zone.

Primary key

JOB_ID

This comes from the JOB_ID field in the raw table.

Primary key

JOB_ARRAY_INDEX

This comes from the JOB_ARRAY_INDEX field in the raw table.

Primary key

HPC_ARRAY_INDEX

Identifies the index of each host in the execution host list.

Primary key

START_TIME

This comes from the START_TIME field in the raw table. It is in the GMT time zone.

 

FINISH_TIME

This comes from the END_TIME field in the raw table. It is in the GMT time zone.

Primary key

NUM_PROCESS

This comes from the NUM_PROCESSORS field in the raw table.

 

JOB_NAME

This comes from the JOB_NAME field in the raw table. If the job name is longer than 64 characters, it is truncated to 64 characters.

 

USER_CODE

This comes from the USER_NAME field in the raw table.

The user name is looked up in the wi_usercode table to see whether a record for it already exists. If it does, the existing code is returned; otherwise, the name is inserted into the wi_usercode table and a new code is generated. The code is a positive integer, and each new code is equal to the maximum existing sequence number + 1.

 

EXECHOST_CODE

This comes from the EXEC_HOSTS field in the LSB_EVENTS table. If this is null, the field is set to "-". If it contains a list of hosts, only the first one is used for this field.

The host name is looked up in the wi_hostcode table to see whether a record for it already exists. If it does, the existing code is returned; otherwise, the name is inserted into the wi_hostcode table and a new code is generated. The code is a positive integer, and each new code is equal to the maximum existing sequence number + 1.
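The host selection rule for EXECHOST_CODE can be sketched as below. This is only an illustration under stated assumptions: EXEC_HOSTS is assumed to have already been parsed into a Python list of host names (its storage format in LSB_EVENTS is not described here), an empty list is treated the same as null, and the function name is hypothetical.

```python
from typing import Optional, Sequence


def pick_exec_host(exec_hosts: Optional[Sequence[str]]) -> str:
    """Return the host name that is then looked up in wi_hostcode."""
    # A null host list is replaced by the placeholder "-".
    # Assumption: an empty list is handled like null.
    if not exec_hosts:
        return "-"
    # When several hosts are listed, only the first one is used.
    return exec_hosts[0]


no_host = pick_exec_host(None)
first_host = pick_exec_host(["hostA", "hostB", "hostC"])
```

The returned name would then go through the same lookup-or-insert step against wi_hostcode that is described in the text.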

 

SUBHOST_CODE

This comes from the FROM_HOST field in the LSB_EVENTS table. If this is null, the field is set to "-".

The host name is looked up in the wi_hostcode table to see whether a record for it already exists. If it does, the existing code is returned; otherwise, the name is inserted into the wi_hostcode table and a new code is generated. The code is a positive integer, and each new code is equal to the maximum existing sequence number + 1.

 

JOB_CMD

This comes from the COMMAND field in the raw table. If it is longer than 128 characters, it is truncated to 128 characters.
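The length limits applied to JOB_NAME (64 characters) and JOB_CMD (128 characters) amount to a simple truncation. The sketch below assumes that the leading characters are the ones kept, which the text does not state explicitly; the constant and function names are hypothetical.

```python
JOB_NAME_MAX = 64   # limit stated for JOB_NAME
JOB_CMD_MAX = 128   # limit stated for JOB_CMD


def truncate(value: str, limit: int) -> str:
    """Keep at most `limit` characters of `value`.

    Assumption: the leading characters are kept and the tail is dropped.
    """
    return value[:limit]


long_command = "x" * 200
stored_command = truncate(long_command, JOB_CMD_MAX)
stored_name = truncate("nightly_build", JOB_NAME_MAX)
```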

 

QUEUE_CODE

This comes from the QUEUE field in the raw table. The queue name is looked up in the wi_queuecode table to see whether a record for it already exists. If it does, the existing code is returned; otherwise, the name is inserted into the wi_queuecode table and a new code is generated. The code is a positive integer, and each new code is equal to the maximum existing sequence number + 1.

 

PROJECT_CODE

This comes from the PROJECT_NAME field in the raw table. If this is null, the field is set to "-".

The project name is looked up in the wi_projectcode table to see whether a record for it already exists. If it does, the existing code is returned; otherwise, the name is inserted into the wi_projectcode table and a new code is generated. The code is a positive integer, and each new code is equal to the maximum existing sequence number + 1.

 

JOB_TYPE

This is derived from two fields of the raw table: "NUM_PROCESSORS" and "OPTIONS".

If "NUM_PROCESSORS" > 1 then set parallel = "true";

If (("OPTIONS" & 0x4000000) == 0x4000000) then set interactive = "true";

If ((parallel == true and interactive != true)) then set job_type = "10", meaning parallel

Else if ((parallel != true and interactive == true)) then set job_type = "01", meaning interactive

Else if ((parallel == true and interactive == true)) then set job_type = "11", meaning interactive and parallel

Else set job_type = "00", meaning normal batch job
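The JOB_TYPE rules above translate directly into a small function. The following is a sketch for illustration; the function and constant names are hypothetical, and only the bit mask and the four result strings come from the description.

```python
INTERACTIVE_FLAG = 0x4000000  # bit tested in OPTIONS, as given in the rules


def job_type(num_processors: int, options: int) -> str:
    """Return the two-character job type: parallel flag, then interactive flag."""
    parallel = num_processors > 1
    interactive = (options & INTERACTIVE_FLAG) == INTERACTIVE_FLAG
    if parallel and not interactive:
        return "10"  # parallel
    if interactive and not parallel:
        return "01"  # interactive
    if parallel and interactive:
        return "11"  # interactive and parallel
    return "00"      # normal batch job


batch = job_type(1, 0)
parallel_only = job_type(8, 0)
interactive_only = job_type(1, INTERACTIVE_FLAG)
both = job_type(8, INTERACTIVE_FLAG | 0x1)
```

Read this way, the first character of the result reflects the parallel test and the second reflects the interactive test.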

 

NUM_EXEC_PROCS

This comes from the NUM_EXEC_HOSTS field in the raw table.

 

JOB_EXIT_STATUS

If "jStatus" == 32 then set this to "EXIT"

ELSE IF "jStatus" == 64 then set this to "DONE"

ELSE set this to "UNKNOWN"
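The JOB_EXIT_STATUS mapping can be sketched as a small function. The name `job_exit_status` and its parameter are hypothetical; the numeric values and result strings are the ones listed above.

```python
def job_exit_status(j_status: int) -> str:
    """Map the raw jStatus value to the stored exit status string."""
    if j_status == 32:
        return "EXIT"
    if j_status == 64:
        return "DONE"
    # Any other value is recorded as unknown.
    return "UNKNOWN"


exited = job_exit_status(32)
done = job_exit_status(64)
other = job_exit_status(4)
```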

 

JOB_EXIT_CODE

This comes from the EXIT_STATUS field in the raw table.

 

RUN_TIME

This is calculated as the time difference between finish_time and start_time. The result is in minutes.

 

PENDING_TIME

This is calculated as start_time - queue_time if start_time is not null; otherwise, it is calculated as finish_time - queue_time. The result is in minutes.
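The RUN_TIME and PENDING_TIME calculations can be sketched as below. This is an illustration under assumptions: the timestamps are taken to be Python `datetime` values, the results are returned as unrounded floats (any rounding applied by the real loader is not described), and the function names are hypothetical. How RUN_TIME is handled when start_time is null is not specified, so `run_time` here requires a start time.

```python
from datetime import datetime
from typing import Optional


def minutes_between(later: datetime, earlier: datetime) -> float:
    """Return the difference between two timestamps in minutes."""
    return (later - earlier).total_seconds() / 60.0


def run_time(start_time: datetime, finish_time: datetime) -> float:
    """RUN_TIME: finish_time - start_time, in minutes."""
    return minutes_between(finish_time, start_time)


def pending_time(queue_time: datetime,
                 start_time: Optional[datetime],
                 finish_time: datetime) -> float:
    """PENDING_TIME: time spent waiting, in minutes."""
    if start_time is not None:
        return minutes_between(start_time, queue_time)
    # The job never started, so it was pending until it finished.
    return minutes_between(finish_time, queue_time)


queued = datetime(2020, 1, 1, 10, 0)
started = datetime(2020, 1, 1, 10, 30)
finished = datetime(2020, 1, 1, 12, 0)

ran_for = run_time(started, finished)
waited = pending_time(queued, started, finished)
waited_never_started = pending_time(queued, None, finished)
```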

 

CPU_TIME

The cpu_time is based on two subfields from the raw table, "RU_UTIME" and "RU_STIME". The sum of these two fields is the cpu_time. It is in seconds.

 

MEM_USAGE

The mem_usage is based on the "MAX_RMEM" subfield from the raw table. The value is divided by 1024. It is in megabytes.

 

SWAP_USAGE

The swap_usage is based on the "MAX_RSWAP" subfield from the raw table. The value is divided by 1024. It is in megabytes.
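The arithmetic behind CPU_TIME, MEM_USAGE, and SWAP_USAGE can be sketched as follows. The function names are hypothetical. Dividing by 1024 to reach megabytes suggests that the raw MAX_RMEM and MAX_RSWAP values are in kilobytes, but that unit is an inference and is not stated in this topic.

```python
def cpu_time_seconds(ru_utime: float, ru_stime: float) -> float:
    """CPU_TIME: sum of the RU_UTIME and RU_STIME subfields, in seconds."""
    return ru_utime + ru_stime


def to_megabytes(raw_value: float) -> float:
    """MEM_USAGE / SWAP_USAGE: raw subfield value divided by 1024.

    Assumption: the raw value is in kilobytes.
    """
    return raw_value / 1024


cpu = cpu_time_seconds(1.5, 0.5)
mem_mb = to_megabytes(2048)    # from MAX_RMEM
swap_mb = to_megabytes(512)    # from MAX_RSWAP
```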

 

TIMEDIFF_HOUR

This is calculated as the time difference between GMT and the local time, in hours.

 

TIMEDIFF_MIN

This is calculated as the time difference between GMT and the local time, in minutes.

 

EXECHOST_MODEL_CODE

Not used

 

CWD

This comes from the CWD field in the raw table.

 

OUTFILE

This comes from the OUT_FILE field in the raw table.

 

CPU_COST

Set to 0

 

MEM_COST

Set to 0

 

SWAP_COST

Set to 0

 

TOTAL_RES_COST

Set to 0

 

USER_GROUP_CODE

If the jobgroup_usergroup process is enabled, this field is updated by a process called "updateJobGroup", which runs during the daily aggregation. Otherwise, this field is not used. The "updateJobGroup" process determines the proper user group for this job during the job's run time. It looks up the jobgroup_code in the wi_jobgroup table based on the JobID, JobArrayIndex, and cluster_code. If the jobgroup is 'Others', its jobgroup_code is 1. It also looks up the more generic usergroup_code and the number of user groups this user belongs to (groupCount) in the wi_usergroupcode table based on the user.

It also looks up the all_usergroupcode in wi_usergroupconf based on the user name 'all' and the start_time of the job (or the finish_time if the job never started because it exited).

The following logic determines the user group:

If the jobgroup_code is null or if it is equal to 1 then

    If usergroup_code is not null and all_usergroupcode is not null then

        If usergroup_code creation time is later than the all_usergroupcode creation time and the dbCount = 1 then

            Set this field as usergroup_code;

        Else if all_usergroupcode creation time is later than usergroup_code creation time then

            Set this field as all_usergroupcode;

        Else

            Set this field as 1 (means other);

        End if;

    Else if usergroup_code is not null and dbCount > 1 then

        Set this field as 1 (means other);

    End if;

    If usergroup_code is null then

        If all_usergroupcode is null then

            Set this field as 1 (means other);

        Else

            Set this field as all_usergroupcode;

        End if;

    Else

        Set this field as usergroup_code;

    End if;

Else

    Set this field as jobgroup_code;

End if;

 

APPLICATION_CODE

This is the code of the job's application tag. It is the result of looking up the APPLICATION_TAG in the WI_APPLICATIONCODE table.

 

INSERT_SEQ

This is a system-generated sequence number. Each newly inserted record is assigned a unique sequence number in this column.

 
