Table of contents
- Setup
- Overview
- Logging device placement
- Manual device placement
- Limiting GPU memory growth
- Using a single GPU on a multi-GPU system
- Using multiple GPUs
TensorFlow code and tf.keras models will transparently run on a single GPU with no code changes required.
Note: Use tf.config.list_physical_devices('GPU') to confirm that TensorFlow is using the GPU.
The simplest way to run on multiple GPUs, on one or many machines, is to use Distribution Strategies.
This guide is for users who have tried these approaches and found that they need fine-grained control of how TensorFlow uses the GPU. To learn how to debug performance issues for single and multi-GPU scenarios, see the Optimize TensorFlow GPU Performance guide.
Setup
Ensure you have the latest TensorFlow GPU release installed.
import tensorflow as tf
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
Num GPUs Available:  4
Overview
TensorFlow supports running computations on a variety of device types, including CPU and GPU. They are represented with string identifiers, for example:
- "/device:CPU:0": The CPU of your machine.
- "/GPU:0": Short-hand notation for the first GPU of your machine that is visible to TensorFlow.
- "/job:localhost/replica:0/task:0/device:GPU:1": Fully qualified name of the second GPU of your machine that is visible to TensorFlow.
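To make the structure of these identifiers concrete, here is a minimal pure-Python sketch that splits a device string into its parts. The helper name parse_device_name and the returned dictionary layout are hypothetical and only for illustration; TensorFlow has its own parser (tf.DeviceSpec.from_string), which this does not attempt to replicate in full.

```python
def parse_device_name(name):
    """Split a TensorFlow-style device string into named parts.

    Hypothetical helper for illustration; it only handles the forms
    shown in this guide (shorthand and fully qualified names).
    """
    parts = {"job": None, "replica": None, "task": None,
             "device_type": None, "device_index": None}
    for field in name.strip("/").split("/"):
        tokens = field.split(":")
        key = tokens[0].lower()
        if key == "job":
            parts["job"] = tokens[1]
        elif key in ("replica", "task"):
            parts[key] = int(tokens[1])
        elif key == "device":
            # Fully qualified form, e.g. device:GPU:1
            parts["device_type"] = tokens[1].upper()
            parts["device_index"] = int(tokens[2])
        elif key in ("cpu", "gpu"):
            # Shorthand form, e.g. GPU:0
            parts["device_type"] = key.upper()
            parts["device_index"] = int(tokens[1])
        else:
            raise ValueError(f"unrecognized device field: {field!r}")
    return parts


print(parse_device_name("/GPU:0"))
print(parse_device_name("/job:localhost/replica:0/task:0/device:GPU:1"))
```

The shorthand and fully qualified forms differ only in how much of the job/replica/task prefix is spelled out; the device type and index are always the last two components.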
If a TensorFlow operation has both CPU and GPU implementations, the GPU device is prioritized by default when the operation is assigned. For example, tf.matmul has both CPU and GPU kernels, and on a system with devices CPU:0 and GPU:0, the GPU:0 device is selected to run tf.matmul unless you explicitly request to run it on another device.
If a TensorFlow operation has no corresponding GPU implementation, the operation falls back to the CPU device. For example, since tf.cast only has a CPU kernel, on a system with devices CPU:0 and GPU:0, the CPU:0 device is selected to run tf.cast, even if it is requested to run on the GPU:0 device.
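The two rules above can be sketched as a toy model in plain Python. This is not TensorFlow's actual placement algorithm; the KERNELS registry and the choose_device function are hypothetical names made up for this illustration, and the registry simply encodes the two ops used as examples in the text.

```python
# Hypothetical kernel registry: the device types that implement each op.
# "Cast" is listed as CPU-only to mirror the example in the text above.
KERNELS = {
    "MatMul": {"CPU", "GPU"},
    "Cast": {"CPU"},
}


def choose_device(op, available=("CPU:0", "GPU:0"), requested=None):
    """Pick a device for `op` following the priority rule described above."""
    supported_types = KERNELS[op]

    def has_kernel(device):
        return device.split(":")[0] in supported_types

    # An explicit request wins when the device exists and has a kernel.
    if requested in available and has_kernel(requested):
        return requested
    # Otherwise prefer GPU devices, then fall back to CPU.
    for device_type in ("GPU", "CPU"):
        for device in available:
            if device.startswith(device_type + ":") and has_kernel(device):
                return device
    raise RuntimeError(f"no available device implements {op}")


print(choose_device("MatMul"))                     # default: GPU preferred
print(choose_device("MatMul", requested="CPU:0"))  # explicit request honored
print(choose_device("Cast", requested="GPU:0"))    # falls back to the CPU
```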
Logging device placement
To find out which devices your operations and tensors are assigned to, put tf.debugging.set_log_device_placement(True) as the first statement of your program. Enabling device placement logging causes any tensor allocations or operations to be printed.
tf.debugging.set_log_device_placement(True)
# Create some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)
print(c)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
The above code prints an indication that the MatMul op was executed on GPU:0.
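When a program produces many placement messages, it can help to collect them into a structure instead of reading them by eye. The sketch below is plain Python and assumes each message has the form "Executing op <name> in device <device>"; the function name parse_placement_log and the sample lines are illustrative only.

```python
import re

# Assumed message format: "Executing op <name> in device <device>"
LOG_PATTERN = re.compile(r"^Executing op (\S+) in device (\S+)$")


def parse_placement_log(lines):
    """Return (op_name, device) pairs for every placement message found."""
    placements = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            placements.append((match.group(1), match.group(2)))
    return placements


sample = [
    "Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0",
    "Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0",
    "tf.Tensor(",  # non-matching lines are skipped
]

for op_name, device in parse_placement_log(sample):
    print(op_name, "->", device)
```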
Manual device placement
If you would like a particular operation to run on a device of your choice instead of the one selected automatically, you can use with tf.device to create a device context. All operations within that context run on the same designated device.
tf.debugging.set_log_device_placement(True)
# Place tensors on the CPU
with tf.device('/CPU:0'):
  a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
  b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
# Run on the GPU
c = tf.matmul(a, b)
print(c)
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
You will see that a and b are now assigned to CPU:0. Since a device was not explicitly specified for the MatMul operation, the TensorFlow runtime chooses one based on the operation and the available devices (GPU:0 in this example) and automatically copies tensors between devices if required.
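Besides reading the log, a tensor's placement can be checked directly through its device attribute. The sketch below creates a tensor inside a CPU device context and returns the device string it reports. The function name device_of_cpu_tensor is hypothetical, and the import guard is an addition so that the snippet also loads on machines without TensorFlow installed.

```python
try:
    import tensorflow as tf
except ImportError:  # allows the sketch to load without TensorFlow
    tf = None


def device_of_cpu_tensor():
    """Create a tensor under tf.device('/CPU:0') and return its device string.

    Returns None when TensorFlow is not installed.
    """
    if tf is None:
        return None
    with tf.device('/CPU:0'):
        a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    # In eager mode, `device` holds the fully qualified device name.
    return a.device


print(device_of_cpu_tensor())
```

The returned name is expected to end in CPU:0 regardless of whether a GPU is present, because the tensor was created inside the CPU context.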
Limiting GPU memory growth
By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to CUDA_VISIBLE_DEVICES) visible to the process. This is done to use the relatively precious GPU memory resources on the devices more efficiently by reducing memory fragmentation. To limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only use the first GPU
  try:
    tf.config.set_visible_devices(gpus[0], 'GPU')
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPU")
  except RuntimeError as e:
    # Visible devices must be set before GPUs have been initialized
    print(e)
Visible devices cannot be modified after being initialized
In some cases it is desirable for the process to allocate only a subset of the available memory, or to grow its memory usage only as needed. TensorFlow provides two methods to control this.
The first option is to turn on memory growth by calling tf.config.experimental.set_memory_growth, which attempts to allocate only as much GPU memory as needed for the runtime allocations: it starts out allocating very little memory, and as the program runs and more GPU memory is needed, the GPU memory region is extended for the TensorFlow process. Memory is not released, since that can lead to memory fragmentation. To turn on memory growth for a specific GPU, use the following code before allocating any tensors or executing any ops.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)
Physical devices cannot be modified after being initialized
Another way to enable this option is to set the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true. This configuration is platform specific.
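As a minimal sketch, the variable can be set from Python itself, provided this happens before TensorFlow initializes the GPUs (in practice, before importing tensorflow). Setting it in the shell that launches the program is an equivalent alternative; the snippet below only touches the process environment and does not import TensorFlow.

```python
import os

# Set the variable before TensorFlow is imported so that it is already
# present in the environment when the GPUs are initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

print(os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```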
The second method is to configure a virtual GPU device with tf.config.set_logical_device_configuration and set a hard limit on the total memory to allocate on the GPU.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Restrict TensorFlow to only allocate 1GB of memory on the first GPU
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
Virtual devices cannot be modified after being initialized
This is useful if you want to truly bound the amount of GPU memory available to the TensorFlow process. It is common practice for local development when the GPU is shared with other applications, such as a workstation GUI.
Using a single GPU on a multi-GPU system
If you have more than one GPU in your system, the GPU with the lowest ID is selected by default. If you would like to run on a different GPU, you need to specify the preference explicitly:
tf.debugging.set_log_device_placement(True)
try:
  # Specify an invalid GPU device
  with tf.device('/device:GPU:2'):
    a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
    b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    c = tf.matmul(a, b)
except RuntimeError as e:
  print(e)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2
If the device you have specified does not exist, you will get a RuntimeError: .../device:GPU:2 unknown device.
If you would like TensorFlow to automatically choose an existing and supported device to run the operations in case the specified one doesn't exist, you can call tf.config.set_soft_device_placement(True).
tf.config.set_soft_device_placement(True)
tf.debugging.set_log_device_placement(True)
# Creates some tensors
a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
c = tf.matmul(a, b)
print(c)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
tf.Tensor(
[[22. 28.]
 [49. 64.]], shape=(2, 2), dtype=float32)
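The difference between strict and soft placement can be illustrated with a small toy model in plain Python. This is only a sketch of the behavior described above, not TensorFlow's implementation: the function resolve_device is hypothetical, and the fallback order it uses (lowest-numbered GPU, then CPU:0) is an assumption made for the example.

```python
def resolve_device(requested, available, soft_placement=False):
    """Toy model of strict versus soft device placement.

    Hypothetical helper; the fallback order is an assumption.
    """
    if requested in available:
        return requested
    if not soft_placement:
        # Strict placement: a missing device is an error.
        raise RuntimeError(f"{requested} unknown device")
    # Soft placement: fall back to an existing device instead of failing.
    gpus = [d for d in available if d.startswith("GPU:")]
    if gpus:
        return min(gpus, key=lambda d: int(d.split(":")[1]))
    return "CPU:0"  # assumes a CPU device is always present


devices = ["CPU:0", "GPU:0", "GPU:1"]
print(resolve_device("GPU:1", devices))
print(resolve_device("GPU:2", devices, soft_placement=True))
try:
    resolve_device("GPU:2", devices)
except RuntimeError as e:
    print(e)
```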
Using multiple GPUs
Developing for multiple GPUs allows a model to scale with the additional resources. If you are developing on a system with a single GPU, you can simulate multiple GPUs with virtual devices. This enables easy testing of multi-GPU setups without requiring additional resources.
gpus = tf.config.list_physical_devices('GPU')
if gpus:
  # Create 2 virtual GPUs with 1GB memory each
  try:
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=1024),
         tf.config.LogicalDeviceConfiguration(memory_limit=1024)])
    logical_gpus = tf.config.list_logical_devices('GPU')
    print(len(gpus), "Physical GPU,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Virtual devices must be set before GPUs have been initialized
    print(e)
Virtual devices cannot be modified after being initialized
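When simulating more than two devices, it may be convenient to derive the per-device limits from a single memory budget. The following plain-Python sketch does that arithmetic; the helper name virtual_gpu_limits and the equal-split policy are assumptions for illustration. Each returned value would be passed as memory_limit (in MB, as in the 1024 = 1GB example above) to one tf.config.LogicalDeviceConfiguration.

```python
def virtual_gpu_limits(budget_mb, count):
    """Split a memory budget (in MB) evenly across `count` virtual GPUs.

    Hypothetical helper; any remainder from the integer division is
    left unallocated.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    per_device = budget_mb // count
    if per_device < 1:
        raise ValueError("budget is too small for the requested device count")
    return [per_device] * count


# Two virtual GPUs sharing a 2048 MB budget, as in the example above.
print(virtual_gpu_limits(2048, 2))
```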
Once there are multiple logical GPUs available to the runtime, you can utilize the multiple GPUs with tf.distribute.Strategy or with manual placement.
With tf.distribute.Strategy
The best practice for using multiple GPUs is to use tf.distribute.Strategy. Here is a simple example:
tf.debugging.set_log_device_placement(True)
gpus = tf.config.list_logical_devices('GPU')
strategy = tf.distribute.MirroredStrategy(gpus)
with strategy.scope():
  inputs = tf.keras.layers.Input(shape=(1,))
  predictions = tf.keras.layers.Dense(1)(inputs)
  model = tf.keras.models.Model(inputs=inputs, outputs=predictions)
  model.compile(loss='mse',
                optimizer=tf.keras.optimizers.SGD(learning_rate=0.2))
INFO:tensorflow:Using MirroredStrategy with devices ('/job:localhost/replica:0/task:0/device:GPU:0', '/job:localhost/replica:0/task:0/device:GPU:1', '/job:localhost/replica:0/task:0/device:GPU:2', '/job:localhost/replica:0/task:0/device:GPU:3')
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op FloorMod in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Cast in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op StatelessRandomGetKeyCounter in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op StatelessRandomUniformV2 in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Sub in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Mul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AddV2 in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:1
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:1
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:1
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:2
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:2
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:2
input: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
Identity: (Identity): /job:localhost/replica:0/task:0/device:GPU:3
output_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
value: (_Arg): /job:localhost/replica:0/task:0/device:GPU:3
AssignVariableOp: (AssignVariableOp): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:0
resource: (_Arg): /job:localhost/replica:0/task:0/device:GPU:0
ReadVariableOp: (ReadVariableOp): /job:localhost/replica:0/task:0/device:GPU:0
value_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:1
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:1
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:2
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:2
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:3
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/device:GPU:3
resource_RetVal: (_Retval): /job:localhost/replica:0/task:0/device:GPU:0
VarHandleOp: (VarHandleOp): /job:localhost/replica:0/task:0/devi
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Fill in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op ReadVariableOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op Identity in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op VarHandleOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AssignVariableOp in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op NoOp in device /job:localhost/replica:0/task:0/device:GPU:0
This program runs a copy of your model on each GPU and splits the input data among them, an approach known as "data parallelism".
For more information about distribution strategies, see the distribution strategy guide.
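The data-parallel idea described above can be illustrated without TensorFlow. The following is only a conceptual sketch, not how tf.distribute is implemented: the one-weight linear model, the `grad_sum` and `shard` helpers, and the `NUM_REPLICAS` constant are all hypothetical names chosen for illustration. It shows that when every replica holds the same weight and processes its own shard of the batch, combining the per-replica gradients gives the same result as computing the gradient on a single device.

```python
# Conceptual sketch of data parallelism in plain Python (no TensorFlow).
# Model: y_hat = w * x, per-example loss = (w * x - y) ** 2.

def grad_sum(w, xs, ys):
    """Sum of d/dw (w*x - y)^2 over the examples in one shard."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))

def shard(seq, num_replicas):
    """Split seq into num_replicas contiguous shards of equal size."""
    size = len(seq) // num_replicas
    return [seq[i * size:(i + 1) * size] for i in range(num_replicas)]

NUM_REPLICAS = 4  # hypothetical: one replica per GPU
w = 0.5           # every replica starts from the same weight
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]

# Each replica computes gradients only on its own shard of the batch.
per_replica = [
    grad_sum(w, shard_x, shard_y)
    for shard_x, shard_y in zip(shard(xs, NUM_REPLICAS), shard(ys, NUM_REPLICAS))
]

# Combine the per-replica results and average over the global batch.
data_parallel_grad = sum(per_replica) / len(xs)

# Reference: the same gradient computed on one device over the full batch.
single_device_grad = grad_sum(w, xs, ys) / len(xs)

print(per_replica)
print(data_parallel_grad, single_device_grad)
```

Because the combined gradient matches the single-device one, each replica can apply the same update and the copies of the model stay in sync.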
Manual placement
tf.distribute.Strategy works under the hood by replicating computation across devices. You can implement replication manually by constructing your model on each GPU. For example:
tf.debugging.set_log_device_placement(True)

gpus = tf.config.list_logical_devices('GPU')
if gpus:
  # Replicate your computation on multiple GPUs
  c = []
  for gpu in gpus:
    # Pin the constants and the matmul to this GPU
    with tf.device(gpu.name):
      a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
      b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
      c.append(tf.matmul(a, b))

  # Sum the per-GPU results on the CPU
  with tf.device('/CPU:0'):
    matmul_sum = tf.add_n(c)

  print(matmul_sum)
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:0
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:1
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:2
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op _EagerConst in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op MatMul in device /job:localhost/replica:0/task:0/device:GPU:3
Executing op AddN in device /job:localhost/replica:0/task:0/device:CPU:0
tf.Tensor(
[[ 88. 112.]
 [196. 256.]], shape=(2, 2), dtype=float32)
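As a sanity check on the tensor printed above, the arithmetic can be reproduced in plain Python. This is a minimal sketch that assumes four GPUs, as in the log; the `matmul` and `add_n` helpers are illustrative stand-ins written for this check, not TensorFlow functions.

```python
def matmul(a, b):
    """Matrix product of two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add_n(mats):
    """Element-wise sum of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
NUM_GPUS = 4  # assumption: matches the four GPUs seen in the log

product = matmul(a, b)               # what each GPU computes
total = add_n([product] * NUM_GPUS)  # what the CPU sums up

print(product)  # [[22.0, 28.0], [49.0, 64.0]]
print(total)    # [[88.0, 112.0], [196.0, 256.0]]
```

Each GPU produces the same 2x2 product, so the sum is four times that product, which matches the printed tensor.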
Originally published on the