Could you please share any existing performance data or insights regarding the following?
- CPU Performance (ARMv8-A):
◦ What is the expected decoding/encoding throughput (e.g., ms per megapixel) on typical ARM cores (e.g., Cortex-A76/A78/X1)?
◦ Are there specific compiler flags (e.g., NEON optimizations, -mcpu tuning) recommended for best performance?
◦ How does floating-point precision (FP32 vs. FP16) affect the performance vs. quality trade-off on mobile chips?
- GPU Acceleration:
◦ Is there any planned support or existing implementation for GPU-accelerated gain map application (e.g., via OpenGL ES, Vulkan, or OpenCL)?
◦ Does the library currently rely entirely on the CPU for tone mapping and gain map blending?
- Memory & Latency:
◦ For a standard 12MP image, what is the typical end-to-end latency on a mid-range ARM SoC?
◦ Are there known memory bandwidth bottlenecks when applying gain maps on mobile devices?