Frequently Asked Questions (FAQ)

How long does it take for jobs to start?

Startup time depends on the current availability of the selected node type, the number of nodes requested, and the type of storage. Jobs running on virtual machines (VMs) typically start in less than one minute. For bare metal (BM) nodes with limited availability, queue times of several hours are not uncommon. Jobs run from file systems usually start faster than jobs run from spaces.

If you run several jobs on the same file system, the first job your team runs with a given CONVERGE version may start more slowly. Each version must be installed on the file system the first time it is used, which takes extra time. Subsequent jobs that use the same CONVERGE version will start faster.

Why am I seeing a warning that says “Start time will be delayed due to node availability”?

This warning appears if you select a node type that has limited availability in the data center where you are running your job. If you proceed to create the job with this node type, the job will remain at a Pending status until nodes become available. We recommend setting a Maximum Queue Time of at least 24 hours to allow ample time for nodes to be allocated.

If you want your job to start sooner, consider these alternatives:

  • Run the job with a different node type that is currently available.

  • Run the job in a different data center where your preferred node type is currently available.

Why did my job fail, or why was it stopped?

If a job fails or stops unexpectedly, first check the exit reason. Jobs commonly fail with the reason “CONVERGE terminated unexpectedly”, which typically indicates a problem with the case setup. If the log file reveals the problem, fix the case setup and create a new job.

Stopped jobs can be restarted from the restart file in the job directory (unless the user who stopped the job intentionally discarded the output files). To restart a job, you must create a new job that is initialized from the appropriate restart file. Detailed instructions for restarting a simulation are available in the CONVERGE Manual on the Convergent Science Hub (login required).

For help troubleshooting issues that you cannot resolve on your own, request support from our team.

Why am I missing output files for a job that was recently completed?

When output files are uploaded to the space at the end of the simulation, they are split into 1 GB pieces and then reassembled once all of the pieces have been uploaded. This process usually takes a few minutes, but on rare occasions it can take 30 minutes or more. If some output files are missing for a job that ran to completion, please wait at least 30 minutes for them to appear. If they are still missing after 1 hour, contact horizon@convergecfd.com for assistance.

Why is it taking so long to download files from my space?

Download speeds are influenced by many factors, including but not limited to our cloud providers, your internet service provider (ISP), your local network policies, and the network carriers in between. Because of these variables, CONVERGE Horizon cannot guarantee specific download or upload speeds when handling your data. For example, under certain circumstances a 1 TB download over a 100 Mbps connection can take 24 hours or more.
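
The 1 TB example can be checked with simple arithmetic. The sketch below is only a rough estimate: it assumes a decimal terabyte (10^12 bytes), and the efficiency parameter is a hypothetical knob standing in for protocol overhead and congestion, not a value published by CONVERGE Horizon.

```python
def transfer_time_hours(size_bytes: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Estimate the hours needed to move size_bytes over a link_mbps connection.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the nominal
    link speed that is actually achieved.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be positive and efficiency must be in (0, 1]")
    total_bits = size_bytes * 8
    effective_bits_per_second = link_mbps * 1_000_000 * efficiency
    return total_bits / effective_bits_per_second / 3600


ONE_TB = 10**12  # decimal terabyte, in bytes

# Ideal case: the full 100 Mbps is available for the entire transfer.
print(f"{transfer_time_hours(ONE_TB, 100):.1f} h")        # 22.2 h

# Illustrative case: only 80% of the nominal speed is achieved.
print(f"{transfer_time_hours(ONE_TB, 100, 0.8):.1f} h")   # 27.8 h
```

Even in the ideal case the transfer takes about 22 hours, so a modest loss of throughput is enough to push it past 24 hours.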

Why does it take so long to delete files from a space?

For every file that you delete from a space, an API request is sent from CONVERGE Horizon servers (located in the U.S.) to the data center hosting the space. Deleting a large number of files or a large folder requires many API requests, which can take some time to process. This processing time may depend, in part, on the geographic distance between the data center and CONVERGE Horizon servers.

If you notice that file deletion from a space is consistently slow, consider using a file system instead. No API requests are needed to delete files from a file system.
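
To get a feel for how per-file API requests add up when deleting from a space, here is a rough back-of-the-envelope sketch. It assumes one request per file and that requests are processed one after another; the per-request time is a purely hypothetical input, since actual request times depend on the data center and are not documented here.

```python
def estimate_delete_hours(n_files: int, seconds_per_request: float) -> float:
    """Estimate total deletion time in hours, assuming one sequential API request per file."""
    if n_files < 0 or seconds_per_request < 0:
        raise ValueError("n_files and seconds_per_request must be non-negative")
    return n_files * seconds_per_request / 3600


# Hypothetical example: 50,000 files at an assumed 0.2 s per request.
print(f"{estimate_delete_hours(50_000, 0.2):.2f} h")  # 2.78 h
```

The estimate scales with the number of files rather than with their total size, which is why a folder of many small files can take longer to delete than a few large files.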

How can I calculate the cost of a job I’m planning to run?

When you create a new job, CONVERGE Horizon provides a cost estimate based on your organization’s licensing setup and the job options you have selected. You can review this estimate before proceeding with job creation.

If you do not have access to a CONVERGE Horizon account, you can use the cost estimator at convergecfd.com to estimate the cost of a job.

Can I check my input files without starting a simulation?

Yes. You can run the check_inputs utility by specifying a command-line argument for CONVERGE.

If your organization uses on-demand licensing, there is no license charge for running check_inputs.

Where can I get information about CONVERGE Horizon outages?

Visit status.convergecfd.com/status/horizon to check the real-time status of CONVERGE Horizon and related cloud services.

Why did my job use fewer cores than were available for my selected node type?

CONVERGE Horizon uses a default number of cores per node that has been optimized for the selected node type. For certain node types, such as BM-INTEL-ICELAKE-36, running fewer than the total number of cores per node has been shown to optimize performance for a set of benchmark cases.

That said, the number of cores that works best for most simulations may not be optimal for your simulation. If necessary, you can override the default value by specifying the number of Cores per Node in Advanced Options when you create a new job.

You can check the number of cores used for a given simulation in the mpi_map.log file.

How do I run chemistry utilities on CONVERGE Horizon?

You can run chemistry utilities and other utilities by passing command-line arguments to the CONVERGE solver.

Can I run a simulation with user-defined functions (UDFs) on CONVERGE Horizon?

Yes. Follow the steps described here.

Can I run ParaView Catalyst on CONVERGE Horizon?

Yes. Complete the case setup for ParaView Catalyst, upload the input files, and create the job as usual. You do not need to configure the runtime environment for ParaView Catalyst. CONVERGE Horizon will automatically perform the required environment setup.

To learn more about setting up cases with ParaView Catalyst, refer to the ParaView documentation on the Convergent Science Hub (login required).

How do I run CONVERGE 2.3 or 2.4?

Select 2.3 or 2.4 as the CONVERGE version, upload a CONVERGE 2.3/2.4 executable to your input directory, and enter the name of the executable in the CONVERGE > Optional Executable Name field. The executable must be an Intel MPI build of CONVERGE 2.3 or 2.4, and you must run the simulation on a single node. Multi-node jobs are not supported for CONVERGE 2.x.

What happens if my organization runs out of credits on CONVERGE Horizon?

If your organization’s balance falls below zero, any jobs that are currently running will be stopped with a STOP file. Users will not be able to create new jobs until more credits are added.

Why do I see a “Permission denied” error when I try to SSH to a job, workstation, or file system?

This could indicate that you have not added an SSH key to your account, or that none of the SSH keys in your account match the public key on your local device.

To compare an SSH key in your account to the key on your local device, first go to My Account > My SSH Keys to see the fingerprint of the key you added to CONVERGE Horizon. Then, open a terminal on your local device and run ssh-keygen -E md5 -lf $HOME/.ssh/<name>.pub, replacing <name> with the file name for your public key (typically id_ed25519 or id_rsa). The output of this command (after MD5:) should match the fingerprint displayed in CONVERGE Horizon.
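
If ssh-keygen is not available on your device, the same MD5 fingerprint can be computed directly from the public key file. The sketch below relies on the standard OpenSSH public key layout (key type, base64-encoded key data, optional comment) and takes the MD5 digest of the decoded key data; the function name and the placeholder key line are illustrative only.

```python
import base64
import hashlib


def md5_fingerprint(public_key_line: str) -> str:
    """Return the MD5 fingerprint of an OpenSSH public key line as colon-separated hex."""
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-key-data> [comment]'")
    key_data = base64.b64decode(fields[1])
    digest = hashlib.md5(key_data).hexdigest()
    # Group the 32 hex characters into 16 colon-separated byte pairs.
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


# Placeholder line for illustration only; it is not a real key.
example_line = "ssh-ed25519 " + base64.b64encode(b"not-a-real-key").decode() + " user@example"
print(md5_fingerprint(example_line))
```

To check your own key, pass the contents of your .pub file to md5_fingerprint and compare the result with the fingerprint shown in My Account > My SSH Keys.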

When I try to extract compressed output files on Linux, I get an error that says “zstd: Cannot exec: No such file or directory”. What does this mean?

The Zstandard (zstd) package is not installed on your system. You must install zstd to extract the files.
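
A quick way to confirm whether zstd is installed is to look for it on your PATH. The sketch below is a minimal check; the function name is illustrative, and the install commands in the comments are the usual ones for common distributions, so use whichever package manager your system provides.

```python
import shutil


def zstd_status(which=shutil.which) -> str:
    """Report whether the zstd executable can be found on PATH."""
    path = which("zstd")
    if path is None:
        # Typical install commands (assumed; they depend on your distribution):
        #   Debian/Ubuntu:  sudo apt install zstd
        #   Fedora/RHEL:    sudo dnf install zstd
        return "zstd not found on PATH; install it with your package manager"
    return f"zstd found at {path}"


print(zstd_status())
```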

When I try to transfer files with the CLI on Windows, I get an error that says “Directory xxx does not exist for sink value of xxx”. Why is this happening?

Some users encounter this error when transferring files with very long file paths. We recommend moving the files to a location with a shorter path and trying again. If you are unable to use a shorter file path, you can work with your IT team to increase the maximum path length on your system.
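
To find out which files are likely to be affected before you start a transfer, you can scan the folder for long paths. The sketch below assumes the classic Windows limit of 260 characters (MAX_PATH); whether the CLI error is triggered at exactly this length is an assumption, so adjust the limit if your system is configured differently.

```python
import os

WINDOWS_MAX_PATH = 260  # classic Windows limit; assumed threshold for this check


def find_long_paths(root: str, limit: int = WINDOWS_MAX_PATH) -> list[str]:
    """Return the absolute paths of files under root whose length is at or above limit."""
    long_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.abspath(os.path.join(dirpath, name))
            if len(full_path) >= limit:
                long_paths.append(full_path)
    return sorted(long_paths)


# Scan the current directory and list any files with overly long paths.
for path in find_long_paths("."):
    print(len(path), path)
```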

When I try to SSH to a job, I see an error message that says “REMOTE HOST IDENTIFICATION HAS CHANGED!” What does this mean?

This means that you previously used SSH to connect to a node with the same IP address as the one you are connecting to now, and the two nodes have different SSH host keys. This happens occasionally because cloud providers reuse IP addresses. To resolve this error, run ssh-keygen -R <IP>, replacing <IP> with the IP address from the error message. This removes the old entry for that address from your known hosts file, and you should then be able to SSH to the job.

When I try to run horizon job:scp on Windows, I get an error that says “unable to start ssh-agent service error :1058”. What are the steps to resolve this?

This error occurs if the OpenSSH Authentication Agent (ssh-agent) service on your computer has the startup type set to Disabled. To run horizon job:scp, you need the startup type set to Manual or Automatic.

Users with administrative permissions can change this setting in Services > OpenSSH Authentication Agent or by running the following commands in PowerShell as an administrator:

# Check the current startup type
Get-Service -Name ssh-agent | Select-Object -Property StartType

# Set the startup type to Manual
Set-Service -Name ssh-agent -StartupType Manual