Parallel Constructs


These functions define the primitive operations used by the physics modules to exchange data in parallel. When using these functions, the physics modules should not have to make any MPI calls directly; all MPI calls should be encapsulated within the provided functions.

TODO: crossref to physics module documentation

The Types and Basic API section describes the SharedFaceData datatype and the basic functions that operate on it. The Parallel Data Exchange section describes the functions used by the physics modules to start and finish parallel communication.

Types and Basic API

The SharedFaceData type holds all the data necessary to perform MPI communication with a given peer process that shares mesh edges (2D) or faces (3D) with the current process.

Fields:

peernum: the MPI rank of the peer process
peeridx: the index of this peer in mesh.peer_parts
myrank: MPI rank of the current process
comm: MPI communicator used to define the above

q_send: the send buffer, a 3D array of n x m x d.  While these dimensions
        are arbitrary, there are two commonly used cases.  If
        opts["parallel_data"] == "face", then m is mesh.numNodesPerFace and
        d is the number of faces shared with peernum.
        If opts["parallel_data"] == "element", then
        m = mesh.numNodesPerElement and d is the number of elements that
        share faces with peernum.
q_recv: the receive buffer.  Similar to q_send, except the size needs to
        be the number of entities on the *remote* process.

send_waited: has someone called MPI.Wait() on send_req yet?  Some MPI
             implementations complain if Wait() is called on a Request
             more than once, so use this field to avoid doing so.
send_req: the MPI.Request object for the send (Isend or whichever other
          type of send was used)
send_status: the MPI.Status object returned by calling Wait() on send_req

recv_waited: like send_waited, but for the receive
recv_req: like send_req, but for the receive
recv_status: like send_status, but for the receive

bndries_local: Vector of Boundaries describing the faces from the local
               side of the interface
bndries_remote: Vector of Boundaries describing the faces from the remote
                side (see the documentation for PdePumiInterface before
                using this field)
interfaces: Vector of Interfaces describing the faces from both sides (see
            the documentation for PdePumiInterface, particularly the
            mesh.shared_interfaces field, before using this field)
source
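
A minimal sketch of the layout these fields imply is shown below. It is for orientation only: the concrete field types, and the placeholder Boundary/Interface definitions, are assumptions; the linked source is authoritative.

```julia
using MPI

# Placeholder stand-ins for the Boundary and Interface types provided by the
# mesh interface package; the field names here are illustrative assumptions.
struct Boundary
  element::Int
  face::Int
end

struct Interface
  elementL::Int
  elementR::Int
  faceL::Int
  faceR::Int
end

# Rough sketch of the SharedFaceData layout implied by the field list above.
mutable struct SharedFaceDataSketch{T}
  peernum::Int                       # MPI rank of the peer process
  peeridx::Int                       # index of this peer in mesh.peer_parts
  myrank::Int                        # MPI rank of the current process
  comm::MPI.Comm                     # communicator on which the ranks are defined

  q_send::Array{T, 3}                # send buffer, n x m x d
  q_recv::Array{T, 3}                # receive buffer, sized for the *remote* entities

  send_waited::Bool                  # has MPI.Wait() been called on send_req yet?
  send_req::MPI.Request
  send_status::MPI.Status

  recv_waited::Bool                  # has MPI.Wait() been called on recv_req yet?
  recv_req::MPI.Request
  recv_status::MPI.Status

  bndries_local::Vector{Boundary}    # local side of each shared face
  bndries_remote::Vector{Boundary}   # remote side of each shared face
  interfaces::Vector{Interface}      # both sides of each shared face
end
```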

Outer constructor for SharedFaceData.

Inputs:

mesh: a mesh object
peeridx: the index of a peer in mesh.peer_parts
q_send: the send buffer
q_recv: the receive buffer
source

This function returns a vector of SharedFaceData objects, one for each peer process with which the current process shares mesh edges (2D) or faces (3D). This function is intended to be used by the AbstractSolutionData constructors, although it can be used to create additional vectors of SharedFaceData objects.

If opts["parallel_data"] == "face", then the send and receive buffers are numDofPerNode x numNodesPerFace x the number of shared faces.

If opts["parallel_data"] == "element", the send and receive buffers are numDofPerNode x numNodesPerElement x the number of elements that share the faces. Note that the number of elements that share the faces can be different for the send and receive buffers.

Inputs:

Tsol: element type of the arrays
mesh: an AbstractMesh object
sbp: an SBP operator
opts: the options dictionary

Outputs:

data_vec: Vector{SharedFaceData}.  data_vec[i] corresponds to 
          mesh.peer_parts[i]
source
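
A minimal usage sketch is shown below. The function name getSharedFaceData is an assumption (the name is not shown in this excerpt), and Tsol, mesh, sbp, and opts are assumed to already be in scope.

```julia
# Hypothetical usage; getSharedFaceData is an assumed name for the function
# documented above.
data_vec = getSharedFaceData(Tsol, mesh, sbp, opts)

# data_vec[i] corresponds to mesh.peer_parts[i]
for (i, data) in enumerate(data_vec)
  @assert data.peernum == mesh.peer_parts[i]
end
```
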
Base.copy!Method.

In-place copy for SharedFaceData. This copies the buffers, but does not retain the state of the Request and Status fields. Instead, they are initialized the same way as in the constructor.

This function may only be called after receiving is complete, otherwise an exception is thrown.

source
Base.copyMethod.

Copy function for SharedFaceData. Note that this does not retain the send_req/send_status (and similarly for the receive) state of the original object. Instead, they are initialized the same way as in the constructor.

This function may only be called after receiving is complete, otherwise an exception is thrown.

source
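
For example (a sketch assuming data is an existing SharedFaceData whose receives have completed):

```julia
# Both forms discard the Request/Status state of the source and re-initialize
# it as in the constructor; calling them before the receives complete throws.
data2 = copy(data)     # new SharedFaceData with copied buffers
copy!(data2, data)     # in-place: copy the buffers of data into data2
```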

Like assertSendsConsistent, but for the receives

source

This function verifies all the receives have been waited on for the supplied SharedFaceData objects

source

Verify that either all or none of the sends have been waited on. Throw an exception otherwise.

Inputs:

shared_data: Vector of SharedFaceData objects

Output:

val: number of sends that have been waited on
source
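
A sketch of how these checks might be used before reusing a set of SharedFaceData objects. assertSendsConsistent is named in the text above; the receive-side names assertReceivesConsistent and assertReceivesWaited are assumptions that mirror it.

```julia
# shared_data is a Vector of SharedFaceData objects
n_sends_waited = assertSendsConsistent(shared_data)     # all-or-none of the sends waited on
n_recvs_waited = assertReceivesConsistent(shared_data)  # all-or-none of the receives waited on
assertReceivesWaited(shared_data)                       # every receive has been waited on
```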

This function is like MPI.Waitall, operating on the receives of a vector of SharedFaceData objects

source
Utils.waitAllSendsMethod.

This function is like MPI.Waitall, operating on the sends of a vector of SharedFaceData objects

source

Like MPI.WaitAny, but operates on the receives of a vector of SharedFaceData. Only the index of the Request that was waited on is returned; the Status and recv_waited fields of the SharedFaceData are updated internally.

source
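
The two ways of draining the receives are sketched below. waitAllReceives and waitAnyReceive are assumed names for the receive-side functions described above (only waitAllSends is named explicitly in this excerpt), and process is a placeholder for user code.

```julia
# Option 1: block until every receive has finished, then use all the buffers.
waitAllReceives(shared_data)

# Option 2: handle each receive as soon as it completes (the pattern used by
# finishExchangeData in the next section).
for _ in 1:length(shared_data)
  idx = waitAnyReceive(shared_data)   # index of the peer whose receive just finished
  process(shared_data[idx].q_recv)    # placeholder for user computation
end
```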

Parallel Data Exchange

The functions in this section are used to start sending data in parallel and finish receiving it. All functions operate on a Vector of SharedFaceData that define what data to send to which peer processes. See Parallel Overview for a high-level overview of how the code is parallelized.

Sending the data to the other processes is straightforward. Receiving it (efficiently) is not. In particular, finishExchangeData waits to receive data from one peer process, calls a user-supplied callback function to do calculations involving the received data, and then waits for the next receive to finish. This is significantly more efficient than waiting for all receives to finish and then doing computations on all the data.

This section describes the API the physics modules use to do parallel communication. The Internals section describes the helper functions used in the implementation.

This function is a thin wrapper around exchangeData(). It is used for the common case of exchanging the solution variables with the other processes. It uses eqn.shared_data to do the parallel communication; eqn.shared_data must be passed into the corresponding finishDataExchange call.

Inputs:

mesh: an AbstractMesh
sbp: an SBP operator
eqn: an AbstractSolutionData
opts: options dictionary

Keyword arguments:

tag: MPI tag to use for communication, defaults to TAG_DEFAULT
wait: wait for sends and receives to finish before exiting

source
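
Putting the two halves together, a typical physics-module call sequence might look like the following sketch. startSolutionExchange is an assumed name for the wrapper described above (the excerpt does not show it), the argument order of the finish call is an assumption, and my_calc_func is a user-supplied callback like the one sketched under finishExchangeData below.

```julia
# Post the sends/receives of the solution variables to every peer process.
startSolutionExchange(mesh, sbp, eqn, opts)

# ... do purely local work here so communication overlaps computation ...

# Wait for each peer's data as it arrives and process it with the callback.
finishExchangeData(mesh, sbp, eqn, opts, eqn.shared_data, my_calc_func)
```
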
Utils.exchangeDataFunction.

This function posts the MPI sends and receives for a vector of SharedFaceData. It works for both opts["parallel_data"] == "face" and opts["parallel_data"] == "element". The only difference between these two cases is the populate_buffer() function.

The previous receives using these SharedFaceData objects should have completed by the time this function is called. An exception is thrown if this is not the case.

The previous sends are likely to have completed by the time this function is called, but they are waited on if not. This function might not perform well if the previous sends have not completed. #TODO: fix this using WaitAny

Inputs:

mesh: an AbstractMesh
sbp: an SBPOperator
eqn: an AbstractSolutionData
opts: the options dictionary
populate_buffer: function with the signature populate_buffer(mesh, sbp, eqn, opts, data::SharedFaceData) that populates data.q_send

Inputs/Outputs:

shared_data: vector of SharedFaceData objects representing the parallel communication to be done

Keyword Arguments:

tag: MPI tag to use for this communication, defaults to TAG_DEFAULT.  This tag is typically used by the communication of the solution variables to other processes.  Other users of this function should provide their own tag.

wait: wait for the sends and receives to finish before returning.  This
      is a debugging option only.  It will kill parallel performance.
source
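
A sketch of a caller-supplied buffer-populating function and the corresponding call. The populate_buffer signature comes from the docstring above; the body, the positional order of shared_data and populate_buffer in the call, and the tag value are assumptions.

```julia
function my_populate_buffer(mesh, sbp, eqn, opts, data::SharedFaceData)
  # real code copies the quantity being communicated into the send buffer;
  # this placeholder just zeroes it
  fill!(data.q_send, 0.0)
  return nothing
end

# TAG_DEFAULT is reserved for the solution variables, so another tag is used here.
exchangeData(mesh, sbp, eqn, opts, shared_data, my_populate_buffer, tag=1001)
```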

This is the counterpart of exchangeData. This function finishes the receives started in exchangeData.

This function (efficiently) waits for a receive to finish and calls a function to do calculations on that data. If opts["parallel_data"] == "face", it also permutes the data in the receive buffers to agree with the ordering of elementL. For opts["parallel_data"] == "element", users should call SummationByParts.interiorFaceInterpolate to interpolate the data to the face while ensuring proper permutation.

Inputs:

mesh: an AbstractMesh
sbp: an SBPOperator
eqn: an AbstractSolutionData
opts: the options dictionary
calc_func: function that does calculations for a set of shared faces described by a single SharedFaceData.  It must have the signature calc_func(mesh, sbp, eqn, opts, data::SharedFaceData)

Inputs/Outputs:

shared_data: vector of SharedFaceData, one for each peer process that needs to be communicated with.  By the time calc_func is called, the SharedFaceData passed to it has its q_recv field populated.  See the note above about data permutation.

source
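
A sketch of a calc_func callback. The signature comes from the docstring above; the loop body is a placeholder.

```julia
function my_calc_func(mesh, sbp, eqn, opts, data::SharedFaceData)
  # data.q_recv is populated (and, for face parallel data, already permuted to
  # agree with the ordering of elementL) by the time this function is called
  for (i, iface) in enumerate(data.interfaces)
    # compute fluxes/face integrals for shared face i (placeholder)
  end
  return nothing
end
```
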
Utils.TAG_DEFAULTConstant.

Default MPI tag used for sending and receiving solution variables.

source

Internals

These helper functions are used by the functions in Parallel Data Exchange.

Utils.verifyCommunication

This function checks the data provided by the Status object to verify that a communication completed successfully. The sender's rank and the number of elements are checked against the expected sender and the buffer size.

Inputs:

data: a SharedFaceData

source
Utils.getSendDataFaceFunction.

This function populates the send buffer from eqn.q for opts["parallel_data"] == "face"

Inputs:

mesh: a mesh
sbp: an SBP operator
eqn: an AbstractSolutionData
opts: options dictionary

Inputs/Outputs:

data: a SharedFaceData.  data.q_send will be overwritten

source

This function populates the send buffer from eqn.q for opts["parallel_data"] == "element"

Inputs:

mesh: a mesh
sbp: an SBP operator
eqn: an AbstractSolutionData
opts: options dictionary

Inputs/Outputs:

data: a SharedFaceData.  data.q_send will be overwritten
source

Utils.mpi_master

This macro introduces an if statement that causes the expression to be executed only if the variable myrank is equal to zero. myrank must exist in the scope of the caller

source
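
A usage sketch, assuming the macro is invoked as @mpi_master and that MPI has been initialized:

```julia
using MPI
MPI.Init()

myrank = MPI.Comm_rank(MPI.COMM_WORLD)   # myrank must exist in the caller's scope
@mpi_master println("only rank 0 prints this")

MPI.Finalize()
```
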
Utils.@time_allMacro.

Utils.time_all

This macro returns the value produced by the expression as well as the execution time, the GC time, and the amount of memory allocated

source
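
A usage sketch; the exact ordering of the returned tuple is an assumption based on the description above:

```julia
# Assumed tuple layout: (value, elapsed time, GC time, bytes allocated).
val, t_elapsed, t_gc, bytes_alloc = @time_all sum(rand(10^6))
println("took $t_elapsed s ($t_gc s in GC), $bytes_alloc bytes allocated")
```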