How to Use Multiple GPUs with TensorFlow (No Code Changes Required)

Content overview

  • Setup
  • Overview
  • Logging device placement
  • Manual device placement
  • Limiting GPU memory growth
  • Using a single GPU on a multi-GPU system
  • Using multiple GPUs

TensorFlow code and tf.keras models will run transparently on a single GPU with no code changes required.

Note: Use tf.config.list_physical_devices('GPU') to confirm that TensorFlow is using the GPU.

The simplest way to run on multiple GPUs, on one or many machines, is to use Distribution Strategies.

This guide is for users who have tried these approaches and found that they need fine-grained control of how TensorFlow uses the GPU. To learn how to debug performance issues in single-GPU and multi-GPU scenarios, see the Optimize TensorFlow GPU Performance guide.

Setup

Make sure you have the latest TensorFlow GPU release installed.

import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available:  4

Overview

TensorFlow supports running computations on a variety of device types, including CPU and GPU. They are represented with string identifiers, for example:

  • "/device:CPU:0": The CPU of your machine.
  • "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
  • "/job:localhost/replica:0/task:0/device:GPU:1": Fully qualified name of the second GPU of your machine that is visible to TensorFlow.
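To make the naming scheme above concrete, here is a minimal sketch of a parser for these device strings. The helper parse_device_name and its regular expression are hypothetical illustrations, not part of the TensorFlow API, and they only cover the three forms listed above.

```python
import re

# Hypothetical helper (not a TensorFlow API): splits a device string into
# its optional job/replica/task prefix and the device type and index.
_DEVICE_RE = re.compile(
    r"^(?:/job:(?P<job>[^/]+))?"
    r"(?:/replica:(?P<replica>\d+))?"
    r"(?:/task:(?P<task>\d+))?"
    r"/(?:device:)?(?P<type>[A-Za-z_]+):(?P<index>\d+)$"
)


def parse_device_name(name):
    """Return a dict of the fields in a TensorFlow-style device string."""
    match = _DEVICE_RE.match(name)
    if match is None:
        raise ValueError(f"Unrecognized device name: {name!r}")
    fields = match.groupdict()
    # Numeric fields become ints; fields absent from short-hand names stay None.
    for key in ("replica", "task", "index"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


for name in ("/device:CPU:0", "/GPU:0",
             "/job:localhost/replica:0/task:0/device:GPU:1"):
    print(name, "->", parse_device_name(name))
```

The short-hand and fully qualified forms resolve to the same device type and index fields; only the job, replica, and task prefix is missing from the short-hand names.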

If a TensorFlow operation has both CPU and GPU implementations, the GPU device is prioritized by default when the operation is assigned. For example, tf.matmul has both CPU and GPU kernels, so on a system with devices CPU:0 and GPU:0, the GPU:0 device is selected to run tf.matmul unless you explicitly request to run it on another device.

If a TensorFlow operation has no corresponding GPU implementation, the operation falls back to the CPU device. For example, since tf.cast only has a CPU kernel, on a system with devices CPU:0 and GPU:0, the CPU:0 device is selected to run tf.cast, even if it is requested to run on the GPU:0 device.

Logging device placement

To find out which devices your operations and tensors are assigned to, put tf.debugging.set_log_device_placement(True) as the first statement of your program. Enabling device placement logging causes any tensor allocations or operations to be printed.

tf.debugging.set_log_device_placement(True)

# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

The code above prints an indication that the MatMul op was executed on GPU:0.

Manual device placement

If you would like a particular operation to run on a device of your choice instead of the one that is automatically selected for you, you can use with tf.device to create a device context. All the operations within that context will run on the same designated device.

tf.debugging.set_log_device_placement(True)

# Place tensors on the CPU
with tf.device('/CPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Run on the GPU
c = tf.matmul(a, b)
print(c)
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

You will see that a and b are now assigned to CPU:0. Since a device was not explicitly specified for the MatMul operation, the TensorFlow runtime chooses one based on the operation and the available devices (GPU:0 in this example) and automatically copies tensors between devices if required.

Limiting GPU memory growth

By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to use the relatively precious GPU memory resources on the devices more efficiently by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method.

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only use the first GPU
  try:
    tf.config.set_visible_devices(gpus[0], 'GPU')
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
  except RuntimeError as e:
    # Visible devices must be set before GPUs have been initialized
    print(e)
Visible devices cannot be modified after being initialized
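The CUDA_VISIBLE_DEVICES variable mentioned above offers a coarser alternative: it hides GPUs from the CUDA runtime itself, so it has to be set before TensorFlow initializes the GPUs. The sketch below is only an illustration, and the helper name restrict_visible_gpus is hypothetical.

```python
import os


def restrict_visible_gpus(indices):
    """Expose only the given GPU indices to CUDA-based libraries.

    Hypothetical helper: it only sets an environment variable, so it must
    run before TensorFlow (or any other CUDA library) initializes the GPUs.
    """
    value = ",".join(str(i) for i in indices)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Make only the first physical GPU visible to this process.
restrict_visible_gpus([0])

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
# print(tf.config.list_physical_devices('GPU'))
```

Unlike tf.config.set_visible_devices, this approach affects every CUDA library in the process, not only TensorFlow.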

In some cases it is desirable for the process to allocate only a subset of the available memory, or to grow the memory usage only as the process needs it. TensorFlow provides two methods to control this.

The first option is to turn on memory growth by calling tf.config.experimental.set_memory_growth, which attempts to allocate only as much GPU memory as the runtime allocations need. It starts out allocating very little memory, and as the program runs and more GPU memory is needed, the GPU memory region is extended for the TensorFlow process. Memory is not released, since releasing it can lead to memory fragmentation. To turn on memory growth for a specific GPU, use the following code before allocating any tensors or executing any ops.

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
Physical devices cannot be modified after being initialized

Another way to enable this option is to set the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true. This configuration is platform specific.
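As a minimal sketch of the environment-variable route, the variable can be set from Python itself, assuming this happens before TensorFlow initializes the GPUs. Exporting it in the shell that launches the program would work as well.

```python
import os

# Must be set before TensorFlow initializes the GPUs; changing it later
# does not affect devices that are already initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
print("TF_FORCE_GPU_ALLOW_GROWTH =", os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```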

The second method is to configure a virtual GPU device with tf.config.set_logical_device_configuration and set a hard limit on the total memory to allocate on the GPU.

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
Virtual devices cannot be modified after being initialized

This is useful if you want to truly bound the amount of GPU memory available to the TensorFlow process. It is common practice for local development when the GPU is shared with other applications, such as a workstation GUI.

Using a single GPU on a multi-GPU system

If you have more than one GPU in your system, the GPU with the lowest ID is selected by default. If you would like to run on a different GPU, you need to specify the preference explicitly:

tf.debugging.set_log_device_placement(True)

try:
  # Specify an invalid GPU device
  with tf.device('/device:GPU:2'):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)
except RuntimeError as e:
  print(e)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2

If the device you have specified does not exist, you will get a RuntimeError: .../device:GPU:2 unknown device.

If you would like TensorFlow to automatically choose an existing and supported device to run the operations in case the specified one does not exist, you can call tf.config.set_soft_device_placement(True).

tf.config.set_soft_device_placement(True)
tf.debugging.set_log_device_placement(True)

# Creates some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)

print(c)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)

Using multiple GPUs

Developing for multiple GPUs allows a model to scale with the additional resources. If you are developing on a system with a single GPU, you can simulate multiple GPUs with virtual devices. This enables easy testing of multi-GPU setups without requiring additional resources.

gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Create 2 virtual GPUs with 1GB memory each
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024),
         tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
Virtual devices cannot be modified after being initialized

Once there are multiple logical GPUs available to the runtime, you can utilize the multiple GPUs with tf.distribute.Strategy or with manual placement.
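Manual placement, the second option named above, can be sketched as follows: the same small computation is built on each logical GPU inside a tf.device context, and the results are summed on the CPU. The fallback to '/CPU:0' when no GPU is present is an assumption added so the sketch also runs on a machine without GPUs.

```python
import tensorflow as tf

# Use every logical GPU; fall back to the CPU so the sketch also runs
# on a machine without GPUs (an assumption added for illustration).
gpus = tf.config.list_logical_devices('GPU')
devices = [gpu.name for gpu in gpus] or ['/CPU:0']

# Replicate the same small computation on each device.
partials = []
for device in devices:
  with tf.device(device):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    partials.append(tf.matmul(a, b))

# Combine the per-device results on the CPU.
with tf.device('/CPU:0'):
  total = tf.add_n(partials)

print(total)
```

With manual placement you decide where each op runs and where results are combined, which tf.distribute.Strategy otherwise handles for you.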

With tf.distribute.Strategy

The best practice for using multiple GPUs is to use tf.distribute.Strategy. Here is a simple example:

tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_logical_devices('GPU')
strategy = tf.distribute.MirroredStrategy(gpus)
with strategy.scope():
  inputs = tf.keras.layers.Input(shape=(1,))
  predictions = tf.keras.layers.Dense(1)(inputs)
  model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
  model.compile(loss='mse',
                optimizer=tf.keras.optimizers.SGD(learning_rate=0.2))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1', '/job:localhost/replica:0/task:0/device:GPU:2', '/job:localhost/replica:0/task:0/device:GPU:3')
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op FloorMod in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Cast in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op StatelessRandomGetKeyCounter in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
enter: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
_EagerConst: (_EagerConst): /job:localhost/reproduction:0/job:0/system:GPU:0
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
a: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
b: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
MatMul: (MatMul): /job:localhost/reproduction:0/job:0/system:GPU:0
product_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
enter: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:2
_EagerConst: (_EagerConst): /job:localhost/reproduction:0/job:0/system:GPU:2
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:2
a: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:2
b: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:2
MatMul: (MatMul): /job:localhost/reproduction:0/job:0/system:GPU:2
product_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:2
resource_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/reproduction:0/job:0/system:GPU:0
useful resource: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
worth: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
AssignVariableOp: (AssignVariableOp): /job:localhost/reproduction:0/job:0/system:GPU:0
enter: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:1
_EagerConst: (_EagerConst): /job:localhost/reproduction:0/job:0/system:GPU:1
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:1
resource_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/reproduction:0/job:0/system:GPU:1
useful resource: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:1
worth: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:1
AssignVariableOp: (AssignVariableOp): /job:localhost/reproduction:0/job:0/system:GPU:1
resource_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/reproduction:0/job:0/system:GPU:2
useful resource: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:2
worth: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:2
AssignVariableOp: (AssignVariableOp): /job:localhost/reproduction:0/job:0/system:GPU:2
enter: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:3
_EagerConst: (_EagerConst): /job:localhost/reproduction:0/job:0/system:GPU:3
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:3
resource_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/reproduction:0/job:0/system:GPU:3
useful resource: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:3
worth: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:3
AssignVariableOp: (AssignVariableOp): /job:localhost/reproduction:0/job:0/system:GPU:3
enter: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
_EagerConst: (_EagerConst): /job:localhost/reproduction:0/job:0/system:GPU:0
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
x: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
y: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
FloorMod: (FloorMod): /job:localhost/reproduction:0/job:0/system:GPU:0
z_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
x: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
Solid: (Solid): /job:localhost/reproduction:0/job:0/system:GPU:0
y_RetVal: (_DeviceRetval): /job:localhost/reproduction:0/job:0/system:GPU:0
enter: (_Arg): /job:localhost/reproduction:0/job:0/system:CPU:0
_EagerConst: (_EagerConst): /job:localhost/reproduction:0/job:0/system:GPU:0
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
seed: (_Arg): /job:localhost/reproduction:0/job:0/system:CPU:0
StatelessRandomGetKeyCounter: (StatelessRandomGetKeyCounter): /job:localhost/reproduction:0/job:0/system:GPU:0
key_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
counter_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
form: (_DeviceArg): /job:localhost/reproduction:0/job:0/system:CPU:0
key: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
counter: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
alg: (_DeviceArg): /job:localhost/reproduction:0/taExecuting op StatelessRandomUniformV2 in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Sub in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Mul in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op AddV2 in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op VarHandleOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op AssignVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op ReadVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Identification in system /job:localhost/reproduction:0/job:0/system:GPU:1
Executing op VarHandleOp in system /job:localhost/reproduction:0/job:0/system:GPU:1
Executing op AssignVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:1
Executing op ReadVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Identification in system /job:localhost/reproduction:0/job:0/system:GPU:2
Executing op VarHandleOp in system /job:localhost/reproduction:0/job:0/system:GPU:2
Executing op AssignVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:2
Executing op ReadVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Identification in system /job:localhost/reproduction:0/job:0/system:GPU:3
Executing op VarHandleOp in system /job:localhost/reproduction:0/job:0/system:GPU:3
Executing op AssignVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:3
Executing op NoOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op NoOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op NoOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op NoOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op _EagerConst in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op _EagerConst in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Fill in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op VarHandleOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op AssignVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op ReadVariableOp in system /job:localhost/reproduction:0/job:0/system:GPU:0
Executing op Identification in system /job:localhost/reproduction:0/job:0/system:GPU:1
sk:0/system:CPU:0
StatelessRandomUniformV2: (StatelessRandomUniformV2): /job:localhost/reproduction:0/job:0/system:GPU:0
output_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
x: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
y: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
Sub: (Sub): /job:localhost/reproduction:0/job:0/system:GPU:0
z_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
x: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
y: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
Mul: (Mul): /job:localhost/reproduction:0/job:0/system:GPU:0
z_RetVal: (_Retval): /job:localhost/reproduction:0/job:0/system:GPU:0
x: (_Arg): /job:localhost/reproduction:0/job:0/system:GPU:0
y: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
AddV2: (AddV2): /job:localhost/replica:0/task:0/device:GPU:0
z_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:1
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:1
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:2
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:2
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:3
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:3
NoOp: (NoOp): /job:localhost/replica:0/task:0/device:GPU:0
dims: (_DeviceArg): /job:localhost/replica:0/task:0/device:CPU:0
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
Fill: (Fill): /job:localhost/replica:0/task:0/device:GPU:0
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:1
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:1
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:2
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:2
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:3
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/devi
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0

This program runs a copy of your model on each GPU and splits the input data among them, an approach known as "data parallelism".
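To make the idea of data parallelism concrete, here is a minimal, framework-free sketch in plain Python. It is not how TensorFlow implements distribution internally. The helper `shard_batch`, the stand-in `model` function, and the replica count of 4 are all assumptions chosen for illustration only.

```python
def shard_batch(batch, num_replicas):
    """Split a batch into num_replicas contiguous shards of near-equal size.

    Hypothetical helper for illustration; not a TensorFlow API.
    """
    base, extra = divmod(len(batch), num_replicas)
    shards, start = [], 0
    for i in range(num_replicas):
        # The first `extra` shards take one additional element each.
        size = base + (1 if i < extra else 0)
        shards.append(batch[start:start + size])
        start += size
    return shards


def model(x):
    # Stand-in for a forward pass. Every replica uses the same "weights".
    return 2.0 * x + 1.0


global_batch = [float(i) for i in range(8)]

# Assumption: 4 replicas, one per GPU.
shards = shard_batch(global_batch, num_replicas=4)

# Each replica runs the same model on its own shard of the input.
per_replica_outputs = [[model(x) for x in shard] for shard in shards]

# Merging the per-replica results gives the same answer as one big batch.
merged = [y for outputs in per_replica_outputs for y in outputs]
print(shards)
print(merged == [model(x) for x in global_batch])
```

The point of the sketch is only that each replica sees a different slice of the data while applying identical computation, so the merged result matches what a single device would produce on the whole batch.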

For more information about distribution strategies, see the guide here.

Manual placement

tf.distribute.Strategy works under the hood by replicating computation across devices. You can implement replication manually by constructing your model on each GPU. For example:

import tensorflow as tf

# Log the device on which each operation runs
tf.debugging.set_log_device_placement(True)

gpus = tf.config.list_logical_devices('GPU')
if gpus:
  # Replicate your computation on multiple GPUs
  c = []
  for gpu in gpus:
    with tf.device(gpu.name):
      a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
      b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
      c.append(tf.matmul(a, b))

  # Sum the per-GPU results on the CPU
  with tf.device('/CPU:0'):
    matmul_sum = tf.add_n(c)

  print(matmul_sum)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
tf.Tensor(
[[ 88. 112.]
 [196. 256.]], shape=(2, 2), dtype=float32)
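The printed values depend on how many GPUs are visible, because every GPU computes the same 2x2 product and the CPU then adds the copies together. The plain-Python check below reproduces that arithmetic without TensorFlow. The `matmul` helper is a hypothetical stand-in for `tf.matmul`, and `num_gpus = 4` is an assumption taken from the GPU:0 through GPU:3 entries in the log above.

```python
def matmul(a, b):
    """Naive matrix product on nested lists (illustrative stand-in for tf.matmul)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

# Assumption: four GPUs, matching the devices shown in the log above.
num_gpus = 4

# Each GPU computes the same product.
per_gpu = [matmul(a, b) for _ in range(num_gpus)]

# Element-wise sum of all copies, mirroring what tf.add_n does on the CPU.
total = [
    [sum(result[i][j] for result in per_gpu) for j in range(2)]
    for i in range(2)
]

print(per_gpu[0])  # [[22.0, 28.0], [49.0, 64.0]]
print(total)       # [[88.0, 112.0], [196.0, 256.0]]
```

On a machine with a different number of GPUs, the final tensor would be the single-GPU product scaled by that count.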

Originally published on the TensorFlow website, this article appears here under a new headline and is licensed under CC BY 4.0. Code samples are shared under the Apache 2.0 license.
