
[tmva][sofie] Clean_name() erases dots causing tensor name collisions in PyTorch parser #21527

@AdityaDRathore

Description


Describe the bug

UTILITY::Clean_name() in SOFIE_common.cxx (line 513) erases dot characters (.) from tensor names instead of replacing them with a safe character (e.g., _). This causes tensor name collisions when parsing TorchScript models whose ONNX graphs contain dotted intermediate names.

The Collision

TorchScript generates intermediate tensor names using a sequential dotted pattern:

input.1 → input0 → input1 → input2 → ...

When Clean_name() processes these:

Original Name    After Clean_name()    Intended Role
-------------    ------------------    -------------
input.1          input1                Model input tensor
input1           input1                Intermediate tensor (output of first op)

Both map to the same C++ variable name input1, causing the generated inference code to contain duplicate declarations or silently overwrite tensor data.

Root Cause

// SOFIE_common.cxx line 513-518
std::string UTILITY::Clean_name(std::string input_tensor_name){
   std::string s (input_tensor_name);
   std::replace( s.begin(), s.end(), '-', '_');
   s.erase(std::remove_if(s.begin(), s.end(), 
       []( char const& c ) -> bool { return !std::isalnum(c) && c != '_'; } ), s.end());
   return s;
}

The function explicitly replaces '-' with '_', but all other non-alphanumeric characters (including '.') are erased rather than replaced. This is inconsistent: '-' gets a safe replacement while '.' is deleted outright.

Affected Parsers

  • PyTorch parser (RModelParser_PyTorch.cxx): TorchScript ONNX graphs consistently use dotted names (input.1, x.1, etc.).
  • ONNX parser (RModelParser_ONNX.cxx): Some ONNX models from external tools also use dotted tensor names.
  • Keras parser: Not directly affected (Keras uses _ in names natively).

Why Existing Tests Pass

The existing PyTorch parser tests (SEQUENTIAL_MODEL, MODULE_MODEL, CONVOLUTION_MODEL) pass by coincidence. TorchScript happens to generate non-colliding name sequences for those specific architectures. Models using operators like Tanh, Softmax, or BatchNormalization trigger the collision because TorchScript assigns sequential dotted names without gaps.

Proposed Fix

Replace dots with underscores before the erase, consistent with the existing '-' → '_' treatment:

std::string UTILITY::Clean_name(std::string input_tensor_name){
   std::string s (input_tensor_name);
   std::replace( s.begin(), s.end(), '-', '_');
   std::replace( s.begin(), s.end(), '.', '_');  // <-- add this line
   s.erase(std::remove_if(s.begin(), s.end(), 
       []( char const& c ) -> bool { return !std::isalnum(c) && c != '_'; } ), s.end());
   return s;
}

This maps input.1 → input_1 (distinct from input1), eliminating the collision.

Alternative (parser-level workaround): Normalize dotted names on the Python side before they reach Clean_name(). This is what PR #21528 (PyTorch parser extension) implements as a workaround, but the root cause remains in Clean_name() and affects all parsers.

Reproducer

  1. Create a PyTorch model with nn.Tanh() or any activation that causes TorchScript to assign sequential dotted tensor names:
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 8),
    nn.Tanh(),
    nn.Linear(8, 6),
)
model.eval()
m = torch.jit.script(model)
torch.jit.save(m, "test_collision.pt")
  2. Parse with SOFIE:
using namespace TMVA::Experimental;
std::vector<size_t> shape{2, 4};
SOFIE::RModel model = SOFIE::PyTorch::Parse("test_collision.pt", {shape});
model.Generate();
  3. Inspect the generated code: the model input input.1 and the intermediate output of Gemm (input1) both become input1, producing a name collision.


ROOT version

master

Installation method

build from source

Operating system

Linux (Ubuntu 24.04)

Additional context

Why the existing tests pass: the current SEQUENTIAL_MODEL, MODULE_MODEL, and CONVOLUTION_MODEL tests use architectures where TorchScript happens to generate non-consecutive name sequences (e.g. input, input.3, input.5 — skipping numbers), so the cleaned names never collide. Models with Tanh, Softmax, or BatchNormalization trigger the collision because TorchScript assigns consecutive names without gaps.

Workaround: normalize dots to underscores on the Python side before the names reach Clean_name(), e.g. x.debugName().replace('.', '_'). This is implemented in PR #21528 as a parser-level workaround.

cc: @lmoneta @sanjibansg @guitargeek
