Job Scheduler



Job Scheduler is the administrator interface for managing all compute nodes and all jobs submitted to the Job Scheduler Service, and for changing the service settings.

 

 

Open Job Scheduler

From the Start menu, go to Programs and click Moldex3D or the Moldex3D folder. Open the remote computing folder, then click Moldex3D Job Scheduler (administrator privileges are required to launch it). When Job Scheduler launches, it automatically checks the status of the Moldex3D Job Scheduler Service, as shown below.

 

Use Job Scheduler Interface

This section introduces the main functions of the Job Scheduler interface.

Jobs

Administrators can view the information and current status of submitted jobs, as shown below. Use the Requeue, Cancel, Remove, and Move Up/Down buttons to change a job’s status or its position in the queue. Jobs can also be filtered by status: “All”, “Active”, “Finished”, “Failed”, or “Canceled”.

Administrators can also view the information and current status of a job’s modules. Double-click a job to open the “module” dialog, as shown below. In this dialog, click the Requeue or Cancel button to requeue or cancel a module.

To view a module’s log, double-click the module.

 

Nodes

Administrators can view information about the compute nodes in the resource pool, including IP/Hostname, FQDN (Fully Qualified Domain Name), node health, status, CPU usage, and other details.

The Nodes tab includes several columns (MPI Service, TCP/IP Registry, VC Redistribution, Windows Firewall, NIC Link Speed) that help administrators check the parallel computing settings of each compute node.

The following tables describe the MPI Service, TCP/IP Registry, VC Redistribution, and Windows Firewall statuses:

MPI Service | Description
The MPI Service is running | MPI Service (smpd.exe) is working.
The MPI Service is not installed | Please reinstall “Parallel Computing Component (Compute node)”.
The MPI Service has stopped. | Please start the service “Intel (R) MPI Library (4.0) Process Manager”.
n/a | Please make sure the node is activated and network connection is OK, and Parallel Computing Component is installed.

MPI Service

 

TCP/IP Registry | Description
Properly Configured | All TCP/IP registry settings are properly configured.
MaxUserPort (set to 65534) | Please set MaxUserPort to 65534.
Autodisconnect (set to 1) | Please set Autodisconnect to 1.
n/a | Please make sure the node is activated and network connection is OK, and Parallel Computing Component is installed.

TCP/IP Registry
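
The rules in the TCP/IP Registry table can be restated as a small Python sketch. How Job Scheduler performs this check internally is not documented here; the function name registry_status and the shape of its input (a dict of observed values, or None for an unreachable node) are assumptions made for illustration only.

```python
# Expected values, taken from the TCP/IP Registry table above.
EXPECTED = {"MaxUserPort": 65534, "Autodisconnect": 1}


def registry_status(observed):
    """Return the table's status strings for one node (hypothetical helper).

    observed: dict of registry values read from the node, e.g.
              {"MaxUserPort": 5000, "Autodisconnect": 1},
              or None if the node could not be queried (the "n/a" row).
    """
    if observed is None:
        return ["n/a"]
    # Report every setting that is missing or differs from the expected value.
    problems = [
        f"{name} (set to {want})"
        for name, want in EXPECTED.items()
        if observed.get(name) != want
    ]
    return problems or ["Properly Configured"]
```

A node that reports MaxUserPort as 5000 would therefore be flagged with the “MaxUserPort (set to 65534)” status, which tells the administrator which value to set.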

 

VC Redistribution | Description
VC2010SP1 is installed | VC Redistribution is OK.
VC2010 is installed | VC Redistribution is OK.
No VC is installed | Please install VC2010SP1/VC2010 redistribution before starting computing.
n/a | Please make sure the node is activated and network connection is OK, and Parallel Computing Component is installed.

VC Redistribution

 

Windows Firewall | Description
Windows firewall is not enabled | Windows firewall is disabled.
Properly configured | Windows firewall is enabled and properly configured.
Mdx3DFlow.exe Mdx3DFlowE.exe… are not in the exception list | Please add Mdx3DFlow.exe Mdx3DFlowE.exe… to the exception list.
n/a | Please make sure the node is activated and network connection is OK, and Parallel Computing Component is installed.

Windows Firewall

 

To view detailed information about a compute node, double-click it.

Administrators can manage compute nodes with the Add, Online, Offline, Remove, Start, Shutdown, and Reboot buttons, and manage a compute node’s resources with the Force Idle and Unforce Idle buttons.

 

Add

To add a compute node to the resource pool manually, click the Add button, as shown below. In the “Add Node” dialog that opens, enter the IP/hostname of the compute node you want to add and click the Add button. Then click the Close button to close the dialog.

 

Online/Offline

Administrators can decide whether a compute node in the resource pool participates in computing. Select the compute node, then click the Online button to have it participate in computing or the Offline button to withdraw it from computing.

 

Remove

Select a compute node whose status is “Offline”, then click the Remove button, as shown below.

 

Start

Administrators can use the Job Scheduler to remotely start a compute node that is shut down. This operation is available only if the node’s health is “unavailable” and its status is “Offline”. To enable this feature, turn on “Wake On LAN” (or Wake On PCI/PCIE device) in the compute node’s BIOS settings and modify the network connection properties in Windows. Follow the instructions below:

1. Open the compute node’s BIOS settings.

2. In the “Power” menu, enable “Wake On LAN” (or Wake On PCI/PCIE device).

3. In the “Power” menu, disable “Deep Power Off Mode”.

4. In the “Boot” menu, modify the boot sequence to give LAN first priority.

5. Start the compute node and boot into Windows.

6. In Network Connection Properties, check the “Wake on Magic Packet from power off state” checkbox.
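
To verify the Wake On LAN configuration independently of Job Scheduler, a magic packet can be sent by hand. The Python sketch below builds the standard packet (6 bytes of 0xFF followed by the target MAC address repeated 16 times) and broadcasts it over UDP. The function names, the broadcast address, and UDP port 9 are assumptions chosen for illustration; how Job Scheduler itself sends the packet is not described here.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake On LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Standard layout: 6 x 0xFF, then the MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_magic_packet(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet on the local network (port 9 is an assumption)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling send_magic_packet with the compute node’s MAC address from another machine on the same subnet should power the node on if the BIOS and network adapter settings above are in effect.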

 

Shutdown/Reboot

Administrators can remotely shut down or reboot a specific compute node through the Job Scheduler. Select the compute node, then click the Shutdown or Reboot button. These operations are available only if the node’s health is “OK” and its status is “Offline”.
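
The availability conditions stated in the Remove, Start, and Shutdown/Reboot sections can be summarized in a short Python sketch. The function name node_operations is hypothetical, and only the conditions stated in this document are encoded; the conditions for the other buttons are not specified here and are left out.

```python
def node_operations(health: str, status: str) -> set:
    """Return the operations this document says are available for a node.

    health: node health as shown in the Nodes tab, e.g. "OK" or "unavailable"
    status: node status, e.g. "Online" or "Offline"
    """
    ops = set()
    if status.lower() != "offline":
        return ops  # Remove, Start, Shutdown and Reboot all require an offline node
    ops.add("Remove")
    if health.lower() == "unavailable":
        ops.add("Start")  # node is shut down; wake it remotely
    elif health.lower() == "ok":
        ops.update({"Shutdown", "Reboot"})  # node is up and reachable
    return ops
```

In practice this means a node must first be taken offline before it can be shut down, rebooted, or removed.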

 

Force Idle/Unforce Idle

Administrators can decide whether each logical processor of a compute node participates in computing. Select a single row in the “node resource” dialog, then click the Unforce Idle button to have it participate in computing or the Force Idle button to withdraw it from computing.

This feature is useful if you do not want Moldex3D jobs to occupy 100% of a compute node’s capacity. Set one or two logical processors to the forced idle state, and they remain available for other tasks while Moldex3D jobs are running on the node.

 

Service Operation

Administrators can start or stop the Job Scheduler Service by clicking the Start Service or Stop Service button, as shown in the following figures.

The Service Configuration button lets administrators change the settings. This button works only when the service is stopped. The configuration details are described below.

 

Windows Data Configuration

In the Windows data configuration, administrators must enter a Windows account and password. The path to Moldex3D is also required.

 

LM Server Configuration

In the license manager configuration, the server name (or IP address) and port of the license manager are required.

 

Portal Node Data Configuration

In the portal node data configuration, administrators must enter the server name (or IP address) and port of the Job Service. The account and password recorded in RC Account Manager are also required.

 

Clicking the Advanced button in Job Scheduler Service Configuration opens the “Advanced Options” dialog, which has three tabs. The Service tab lets administrators modify the port used by the Job Scheduler Service; the default port is 10010. In the Analysis tab, if “Auto requeue job” is checked, jobs that were queued or running are requeued automatically when the service is restarted. The Node Monitoring tab lets administrators modify the time interval for updating node status (the default is 3 s) and show extra details such as CPU usage and free RAM.
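
After changing the service port, it can be useful to confirm from another machine that the port accepts TCP connections. The Python sketch below is a generic reachability check, not part of Moldex3D; the function name port_is_open and the 3-second timeout are assumptions, and 10010 is the default port mentioned above.

```python
import socket


def port_is_open(host: str, port: int = 10010, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A False result only shows that the port is unreachable from the calling machine; the cause may be a stopped service, a different configured port, or a firewall rule.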

 

Message

The messages shown in this window record information about operations.