Device Properties:
  Device Name: Tesla S2050
  GPU Code Name: GF100
  PCI Domain / Bus / Device: 0 / 5 / 0
  Clock Rate: 1147 MHz
  Asynchronous Engines: 2
  Multiprocessors / Cores: 14 / 448
  L2 Cache: 768 KB
  Max Threads Per Multiprocessor: 1536
  Max Threads Per Block: 1024
  Max Registers Per Block: 32768
  Max 32-bit Registers Per Multiprocessor: 32768
  Max Instructions Per Kernel: 512 million
  Warp Size: 32 threads
  Max Block Size: 1024 x 1024 x 64
  Max Grid Size: 65535 x 65535 x 65535
  Max 1D Texture Width: 65536
  Max 2D Texture Size: 65536 x 65535
  Max 3D Texture Size: 2048 x 2048 x 2048
  Max 1D Linear Texture Width: 134217728
  Max 2D Linear Texture Size: 65000 x 65000
  Max 2D Linear Texture Pitch: 1048544 bytes
  Max 1D Layered Texture Width: 16384
  Max 1D Layered Texture Layers: 2048
  Max Mipmapped 1D Texture Width: 16384
  Max Mipmapped 2D Texture Size: 16384 x 16384
  Max Cubemap Texture Size: 16384 x 16384
  Max Cubemap Layered Texture Size: 16384 x 16384
  Max Cubemap Layered Texture Layers: 2046
  Max Texture Array Size: 16384 x 16384
  Max Texture Array Slices: 2048
  Max 1D Surface Width: 65536
  Max 2D Surface Size: 65536 x 32768
  Max 3D Surface Size: 65536 x 32768 x 2048
  Max 1D Layered Surface Width: 65536
  Max 1D Layered Surface Layers: 2048
  Max 2D Layered Surface Size: 65536 x 32768
  Max 2D Layered Surface Layers: 2048
  Compute Mode: Default: Multiple contexts allowed per device
  Compute Capability: 2.0
  CUDA DLL: nvcuda.dll (6.14.13.8205 - nVIDIA ForceWare 382.05)

Memory Properties:
  Memory Clock: 1546 MHz
  Global Memory Bus Width: 384-bit
  Total Memory: 2637 MB
  Total Constant Memory: 64 KB
  Max Shared Memory Per Block: 48 KB
  Max Shared Memory Per Multiprocessor: 48 KB
  Max Memory Pitch: 2147483647 bytes
  Texture Alignment: 512 bytes
  Texture Pitch Alignment: 32 bytes
  Surface Alignment: 512 bytes

Device Features:
  32-bit Floating-Point Atomic Addition: Supported
  32-bit Integer Atomic Operations: Supported
  64-bit Integer Atomic Operations: Supported
  Caching Globals in L1 Cache: Supported
  Caching Locals in L1 Cache: Supported
  Concurrent Kernel Execution: Supported
  Concurrent Memory Copy & Execute: Supported
  Double-Precision Floating-Point: Supported
  ECC: Enabled
  Funnel Shift: Not Supported
  Half-Precision Floating-Point: Not Supported
  Host Memory Mapping: Supported
  Integrated Device: No
  Managed Memory: Not Supported
  Multi-GPU Board: No
  Stream Priorities: Not Supported
  Surface Functions: Supported
  TCC Driver: Yes
  Warp Vote Functions: Supported
  __ballot(): Supported
  __syncthreads_and(): Supported
  __syncthreads_count(): Supported
  __syncthreads_or(): Supported
  __threadfence_system(): Supported

[ OpenCL: nVIDIA Tesla S2050 (GF100) ]

OpenCL Properties:
  Platform Name: NVIDIA CUDA
  Platform Vendor: NVIDIA Corporation
  Platform Version: OpenCL 1.2 CUDA 8.0.0
  Platform Profile: Full

Device Properties:
  Device Name: Tesla S2050
  GPU Code Name: GF100
  Device Type: GPU
  Device Vendor: NVIDIA Corporation
  Device Version: OpenCL 1.1 CUDA
  Device Profile: Full
  Driver Version: 382.05
  OpenCL C Version: OpenCL C 1.1
  Clock Rate: 1147 MHz
  Compute Units / Cores: 14 / 448
  Address Space Size: 32-bit
  Max 2D Image Size: 16384 x 16384
  Max 3D Image Size: 2048 x 2048 x 2048
  Max Image Array Size: 2048
  Max Image Buffer Size: 134217728
  Max Samplers: 16
  Max Work-Item Size: 1024 x 1024 x 64
  Max Work-Group Size: 1024
  Max Argument Size: 4352 bytes
  Max Constant Buffer Size: 64 KB
  Max Constant Arguments: 9
  Max Printf Buffer Size: 1 MB
  Native ISA Vector Widths: char1, short1, int1, float1, double1
  Preferred Native Vector Widths: char1, short1, int1, long1, float1, double1
  Profiling Timer Resolution: 1000 ns
  CUDA Compute Capability: 2.0
  Max Registers Per Block: 32768
  Warp Size: 32 threads
  Asynchronous Engines: 2
  PCI Bus / Device: 5 / 0
  OpenCL DLL: opencl.dll (2.0.4.0)

Memory Properties:
  Global Memory: 2637 MB
  Global Memory Cache: 224 KB (Read/Write, 128-byte line)
  Local Memory: 48 KB
  Max Memory Object Allocation Size: 675152 KB
  Memory Base Address Alignment: 4096-bit
  Min Data Type Alignment: 128 bytes

OpenCL Compliancy:
  OpenCL 1.1: Yes (100%)
  OpenCL 1.2: Yes (100%)
  OpenCL 2.0: No (62%)

Device Features:
  Command-Queue Out Of Order Execution: Enabled
  Command-Queue Profiling: Enabled
  Compiler Available: Yes
  Error Correction: Supported
  Images: Supported
  Kernel Execution: Supported
  Linker Available: Yes
  Little-Endian Device: Yes
  Native Kernel Execution: Not Supported
  Sub-Group Independent Forward Progress: Not Supported
  SVM Atomics: Not Supported
  SVM Coarse Grain Buffer: Not Supported
  SVM Fine Grain Buffer: Supported
  SVM Fine Grain System: Not Supported
  Thread Trace: Not Supported
  Unified Memory: No

Half-Precision Floating-Point Capabilities:
  Correctly Rounded Divide and Sqrt: Not Supported
  Denorms: Not Supported
  IEEE754-2008 FMA: Not Supported
  INF and NaNs: Not Supported
  Rounding to Infinity: Not Supported
  Rounding to Nearest Even: Not Supported
  Rounding to Zero: Not Supported
  Software Basic Floating-Point Operations: No

Single-Precision Floating-Point Capabilities:
  Correctly Rounded Divide and Sqrt: Not Supported
  Denorms: Supported
  IEEE754-2008 FMA: Supported
  INF and NaNs: Supported
  Rounding to Infinity: Supported
  Rounding to Nearest Even: Supported
  Rounding to Zero: Supported
  Software Basic Floating-Point Operations: No

Double-Precision Floating-Point Capabilities:
  Correctly Rounded Divide and Sqrt: Not Supported
  Denorms: Supported
  IEEE754-2008 FMA: Supported
  INF and NaNs: Supported
  Rounding to Infinity: Supported
  Rounding to Nearest Even: Supported
  Rounding to Zero: Supported
  Software Basic Floating-Point Operations: No

Device Extensions:
  Total / Supported Extensions: 103 / 16
  cl_altera_compiler_mode: Not Supported
  cl_altera_device_temperature: Not Supported
  cl_altera_live_object_tracking: Not Supported
  cl_amd_bus_addressable_memory: Not Supported
  cl_amd_c1x_atomics: Not Supported
  cl_amd_compile_options: Not Supported
  cl_amd_core_id: Not Supported
  cl_amd_d3d10_interop: Not Supported
  cl_amd_d3d9_interop: Not Supported
  cl_amd_device_attribute_query: Not Supported
  cl_amd_device_board_name: Not Supported
  cl_amd_device_memory_flags: Not Supported
  cl_amd_device_persistent_memory: Not Supported
  cl_amd_device_profiling_timer_offset: Not Supported
  cl_amd_device_topology: Not Supported
  cl_amd_event_callback: Not Supported
  cl_amd_fp64: Not Supported
  cl_amd_hsa: Not Supported
  cl_amd_image2d_from_buffer_read_only: Not Supported
  cl_amd_liquid_flash: Not Supported
  cl_amd_media_ops: Not Supported
  cl_amd_media_ops2: Not Supported
  cl_amd_offline_devices: Not Supported
  cl_amd_popcnt: Not Supported
  cl_amd_predefined_macros: Not Supported
  cl_amd_printf: Not Supported
  cl_amd_svm: Not Supported
  cl_amd_vec3: Not Supported
  cl_apple_contextloggingfunctions: Not Supported
  cl_apple_gl_sharing: Not Supported
  cl_apple_setmemobjectdestructor: Not Supported
  cl_arm_core_id: Not Supported
  cl_arm_printf: Not Supported
  cl_ext_atomic_counters_32: Not Supported
  cl_ext_atomic_counters_64: Not Supported
  cl_ext_device_fission: Not Supported
  cl_ext_migrate_memobject: Not Supported
  cl_intel_accelerator: Not Supported
  cl_intel_advanced_motion_estimation: Not Supported
  cl_intel_ctz: Not Supported
  cl_intel_d3d11_nv12_media_sharing: Not Supported
  cl_intel_device_partition_by_names: Not Supported
  cl_intel_device_side_avc_motion_estimation: Not Supported
  cl_intel_driver_diagnostics: Not Supported
  cl_intel_dx9_media_sharing: Not Supported
  cl_intel_exec_by_local_thread: Not Supported
  cl_intel_media_block_io: Not Supported
  cl_intel_motion_estimation: Not Supported
  cl_intel_packed_yuv: Not Supported
  cl_intel_planar_yuv: Not Supported
  cl_intel_printf: Not Supported
  cl_intel_required_subgroup_size: Not Supported
  cl_intel_simultaneous_sharing: Not Supported
  cl_intel_subgroups: Not Supported
  cl_intel_subgroups_short: Not Supported
  cl_intel_thread_local_exec: Not Supported
  cl_intel_va_api_media_sharing: Not Supported
  cl_intel_vec_len_hint: Not Supported
  cl_intel_visual_analytics: Not Supported
  cl_khr_3d_image_writes: Not Supported
  cl_khr_byte_addressable_store: Supported
  cl_khr_context_abort: Not Supported
  cl_khr_d3d10_sharing: Supported
  cl_khr_d3d11_sharing: Not Supported
  cl_khr_depth_images: Not Supported
  cl_khr_dx9_media_sharing: Not Supported
  cl_khr_egl_event: Not Supported
  cl_khr_egl_image: Not Supported
  cl_khr_fp16: Not Supported
  cl_khr_fp64: Supported
  cl_khr_gl_depth_images: Not Supported
  cl_khr_gl_event: Not Supported
  cl_khr_gl_msaa_sharing: Not Supported
  cl_khr_gl_sharing: Supported
  cl_khr_global_int32_base_atomics: Supported
  cl_khr_global_int32_extended_atomics: Supported
  cl_khr_icd: Supported
  cl_khr_il_program: Not Supported
  cl_khr_image2d_from_buffer: Not Supported
  cl_khr_initialize_memory: Not Supported
  cl_khr_int64_base_atomics: Not Supported
  cl_khr_int64_extended_atomics: Not Supported
  cl_khr_local_int32_base_atomics: Supported
  cl_khr_local_int32_extended_atomics: Supported
  cl_khr_mipmap_image: Not Supported
  cl_khr_mipmap_image_writes: Not Supported
  cl_khr_priority_hints: Not Supported
  cl_khr_select_fprounding_mode: Not Supported
  cl_khr_spir: Not Supported
  cl_khr_srgb_image_writes: Not Supported
  cl_khr_subgroups: Not Supported
  cl_khr_terminate_context: Not Supported
  cl_khr_throttle_hints: Not Supported
  cl_nv_compiler_options: Supported
  cl_nv_copy_opts: Supported
  cl_nv_create_buffer: Supported
  cl_nv_d3d10_sharing: Supported
  cl_nv_d3d11_sharing: Supported
  cl_nv_d3d9_sharing: Not Supported
  cl_nv_device_attribute_query: Supported
  cl_nv_pragma_unroll: Supported
  cl_qcom_ext_host_ptr: Not Supported
  cl_qcom_ion_host_ptr: Not Supported

CUDA-Z Report
=============
Version: 0.10.251 64 bit http://cuda-z.sf.net/
OS Version: Windows x86 6.2.9200
Driver Version: 382.05 (TCC)
Driver Dll Version: 8.0 (6.14.13.8205)
Runtime Dll Version: 6.50

Core Information
----------------
  Name: Tesla S2050
  Compute Capability: 2.0
  Clock Rate: 1147 MHz
  PCI Location: 0:5:0
  Multiprocessors: 14 (448 Cores)
  Threads Per Multiproc.: 1536
  Warp Size: 32
  Regs Per Block: 32768
  Threads Per Block: 1024
  Threads Dimensions: 1024 x 1024 x 64
  Grid Dimensions: 65535 x 65535 x 65535
  Watchdog Enabled: No
  Integrated GPU: No
  Concurrent Kernels: Yes
  Compute Mode: Default
  Stream Priorities: No

Memory Information
------------------
  Total Global: 2637.31 MiB
  Bus Width: 384 bits
  Clock Rate: 1546 MHz
  Error Correction: Yes
  L2 Cache Size: 48 KiB
  Shared Per Block: 48 KiB
  Pitch: 2048 MiB
  Total Constant: 64 KiB
  Texture Alignment: 512 B
  Texture 1D Size: 65536
  Texture 2D Size: 65536 x 65535
  Texture 3D Size: 2048 x 2048 x 2048
  GPU Overlap: Yes
  Map Host Memory: Yes
  Unified Addressing: Yes
  Async Engine: Yes, Bidirectional

Performance Information
-----------------------
Memory Copy
  Host Pinned to Device: 6168.5 MiB/s
  Host Pageable to Device: 5277.82 MiB/s
  Device to Host Pinned: 6216.19 MiB/s
  Device to Host Pageable: 5886.83 MiB/s
  Device to Device: 51.3764 GiB/s
GPU Core Performance
  Single-precision Float: 1021.13 Gflop/s
  Double-precision Float: 372.736 Gflop/s
  64-bit Integer: 126.519 Giop/s
  32-bit Integer: 512.704 Giop/s
  24-bit Integer: 498.266 Giop/s
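
A rough cross-check of the GPU Core Performance figures: the sketch below estimates theoretical peaks from the core count, clock rates, and bus width reported above. The multipliers are assumptions that do not appear in either report: one fused multiply-add (2 floating-point operations) per core per cycle, double precision at half the single-precision rate, and two memory transfers per reported memory clock. All function and variable names are illustrative only.

```python
# Values copied from the reports above.
CORES = 448                  # Multiprocessors / Cores: 14 / 448
CORE_CLOCK_MHZ = 1147        # Clock Rate
MEM_CLOCK_MHZ = 1546         # Memory Clock
BUS_WIDTH_BITS = 384         # Global Memory Bus Width
MEASURED_SP_GFLOPS = 1021.13   # CUDA-Z Single-precision Float
MEASURED_DP_GFLOPS = 372.736   # CUDA-Z Double-precision Float

# Assumptions, NOT taken from the reports:
FLOPS_PER_CORE_PER_CYCLE = 2   # one fused multiply-add per core per cycle
DP_TO_SP_RATIO = 0.5           # double precision at half the single-precision rate
TRANSFERS_PER_MEM_CLOCK = 2    # two transfers per reported memory clock


def peak_gflops(cores, clock_mhz, flops_per_cycle):
    """Theoretical peak in Gflop/s (decimal giga)."""
    return cores * clock_mhz * flops_per_cycle / 1000.0


def peak_bandwidth_gbs(mem_clock_mhz, bus_width_bits, transfers_per_clock):
    """Theoretical peak memory bandwidth in GB/s (decimal giga)."""
    bytes_per_transfer = bus_width_bits / 8
    return mem_clock_mhz * transfers_per_clock * bytes_per_transfer / 1000.0


sp_peak = peak_gflops(CORES, CORE_CLOCK_MHZ, FLOPS_PER_CORE_PER_CYCLE)
dp_peak = sp_peak * DP_TO_SP_RATIO
bw_peak = peak_bandwidth_gbs(MEM_CLOCK_MHZ, BUS_WIDTH_BITS, TRANSFERS_PER_MEM_CLOCK)

print(f"SP peak estimate: {sp_peak:.3f} Gflop/s "
      f"(measured is {MEASURED_SP_GFLOPS / sp_peak:.1%} of it)")
print(f"DP peak estimate: {dp_peak:.3f} Gflop/s "
      f"(measured is {MEASURED_DP_GFLOPS / dp_peak:.1%} of it)")
print(f"Memory bandwidth estimate: {bw_peak:.3f} GB/s")
```

Under these assumptions the measured single-precision result is roughly 99% of the estimate and the double-precision result roughly 73%. The bandwidth estimate is in decimal GB/s and ignores ECC overhead, so it is not directly comparable to the Device to Device figure, which CUDA-Z reports in GiB/s.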