Fix MSSentinelSearch environment name and connection check in AzureSearchDriver #871

Open
Copilot wants to merge 14 commits into main from copilot/fix-azure-search-driver-connection

Copilot AI commented Jan 21, 2026

AzureSearchDriver inherited EFFECTIVE_ENV="MSSentinel" from its parent AzureMonitorDriver, causing the QueryProvider to report the wrong environment name. Additionally, the inherited query() method checked for _query_client (used by parent) instead of _auth_header (used by this driver), causing "Workspace not connected" errors even after successful connection.
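The failure mode can be reproduced in isolation. The sketch below uses hypothetical stand-in classes (ParentDriver, SearchDriver, NotConnectedError), not the real msticpy drivers, and assumes only what the description above states: the parent's query() checks _query_client, while the child's connect() populates _auth_header.

```python
class NotConnectedError(Exception):
    """Hypothetical stand-in for MsticpyNotConnectedError."""


class ParentDriver:
    """Stand-in for AzureMonitorDriver (illustrative only)."""

    effective_env = "MSSentinel"

    def __init__(self):
        self._connected = False
        self._query_client = None

    def query(self, query):
        # The parent only knows about its own client attribute.
        if not self._connected or self._query_client is None:
            raise NotConnectedError("Workspace not connected.")
        return f"ran: {query}"


class SearchDriver(ParentDriver):
    """Stand-in for AzureSearchDriver before the fix."""

    def __init__(self):
        super().__init__()
        self._auth_header = None

    def connect(self):
        # Connection state lives in _auth_header; _query_client stays None.
        self._auth_header = {"Authorization": "Bearer <token>"}
        self._connected = True


driver = SearchDriver()
driver.connect()
try:
    driver.query("SyslogBasic_CL | take 1")
    outcome = "ok"
except NotConnectedError as err:
    outcome = str(err)

# The inherited name and the inherited check are both wrong for the child.
print(driver.effective_env, "|", outcome)
```

Both symptoms come from the same cause: the subclass changed how it connects without overriding the inherited members that describe or test that connection.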

Changes

  • Override environment in __init__: Set EFFECTIVE_ENV to "MSSentinelSearch" after parent initialization
  • Override query filter: Include "MSSentinelSearch" in supported environments tuple
  • Extract connection check: Add _ensure_connected() helper to check _auth_header is not None
  • Override query method: Use _ensure_connected() instead of parent's _query_client check
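The four changes above can be sketched as follows. This is a minimal illustration with hypothetical stand-in classes, not the code in this PR: it assumes the parent's query() calls an overridable _ensure_connected() hook, and the attribute names effective_env and query_filter are simplifications of the driver properties.

```python
class NotConnectedError(Exception):
    """Hypothetical stand-in for MsticpyNotConnectedError."""


class ParentDriver:
    """Stand-in for AzureMonitorDriver (illustrative only)."""

    def __init__(self):
        self.effective_env = "MSSentinel"
        self.query_filter = {"data_environments": ("MSSentinel",)}
        self._query_client = None

    def _ensure_connected(self):
        if self._query_client is None:
            raise NotConnectedError("Workspace not connected.")

    def query(self, query):
        self._ensure_connected()
        return f"ran: {query}"


class SearchDriver(ParentDriver):
    """Stand-in for AzureSearchDriver after the fix."""

    def __init__(self):
        super().__init__()
        # Set the environment name after parent initialization.
        self.effective_env = "MSSentinelSearch"
        # Extend the filter so both environment names are accepted.
        self.query_filter["data_environments"] += ("MSSentinelSearch",)
        self._auth_header = None

    def connect(self):
        self._auth_header = {"Authorization": "Bearer <token>"}

    def _ensure_connected(self):
        # This driver's connection state is the auth header, not a client.
        if self._auth_header is None:
            raise NotConnectedError("Workspace not connected.")


driver = SearchDriver()
driver.connect()
result = driver.query("SyslogBasic_CL | take 1")
print(driver.effective_env, "|", result)
```

Keeping the check in a single overridable helper means the subclass does not need to copy the rest of the parent's query() body.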

Example

from datetime import datetime, timedelta, timezone

from msticpy.data import QueryProvider

# One-hour window, as in the reproduction script from the issue
end = datetime.now(timezone.utc) - timedelta(minutes=15)
start = end - timedelta(hours=1)

# Before: failed with "Workspace not connected"
qry_prov = QueryProvider('MSSentinelSearch')
qry_prov.connect(workspace='BasicLogs')
print(qry_prov.environment)  # Showed "MSSentinel" (wrong)
df = qry_prov.exec_query('SyslogBasic_CL | take 1', start=start, end=end)  # Failed

# After: works correctly
qry_prov = QueryProvider('MSSentinelSearch')
qry_prov.connect(workspace='BasicLogs')
print(qry_prov.environment)  # Shows "MSSentinelSearch" (correct)
df = qry_prov.exec_query('SyslogBasic_CL | take 1', start=start, end=end)  # Succeeds

Original prompt

This section details the original issue you should resolve

<issue_title>[Bug]: New experimental MSSentinelSearch data provider doesn't correctly use the AzureSearchDriver</issue_title>
<issue_description>Describe the bug

The MSSentinelSearch query provider / data environment seems to get confused between using the MSSentinel vs MSSentinelSearch data environments and fails to correctly connect the AzureSearchDriver.

To Reproduce

Steps to reproduce the behavior:

  1. Clone this git repo and check out main to test PR #825
    (Ianhelle/az monitor search driver 2025 02 05), which is included in main.
  2. Create an editable venv from the source and activate.
  3. Configure msticpyconfig.yaml with a 'Sentinel' workspace that includes a basic table.
  4. Run a test script with the AzureSearchDriver on a table with the 'basic' plan.
import datetime

# Set debug logging
import logging
logging.basicConfig(level=logging.DEBUG)

# Inherit log level
import msticpy
print(f'msticpy version: {msticpy.__version__}')

# Config
msticpy.init_notebook()

# ws_config = msticpy.common.wsconfig.WorkspaceConfig(workspace="MyWorkspace")
# print(f'Workspace config: {ws_config}')
#qry_prov_basic_search = msticpy.QueryProvider(data_environment='MSSentinelSearch', ws_config=ws_config, workspace='BasicLogs')
qry_prov_basic_search = msticpy.QueryProvider('MSSentinelSearch')
qry_prov_basic_search.connect(workspace='BasicLogs')
print(f'Query provider driver: {qry_prov_basic_search.driver_class}')
print(f'Query provider environment: {qry_prov_basic_search.environment}')
print(f'Query provider connections: {qry_prov_basic_search.list_connections()}')

# Prep a small time range to limit basic logs query costs
lookback_period = datetime.timedelta(hours=1)
ingest_grace_period = datetime.timedelta(minutes=15)
end = datetime.datetime.now(datetime.timezone.utc) - ingest_grace_period
start = end - lookback_period
print(f'Start: {start}, End: {end}')

# Test query
df = qry_prov_basic_search.exec_query('SyslogBasic_CL | take 1', start=start, end=end)
print(df)

Expected behavior

AzureSearchDriver is connected and used with the corresponding MSSentinelSearch data environment.

Screenshots and/or Traceback

INFO:msticpy.data.drivers.azure_monitor_driver:AzureMonitorDriver loaded. connect_str  None, kwargs: {'data_environment': <DataEnvironment.MSSentinelSearch: 25>}
INFO:msticpy.data.core.data_providers:Using data environment MSSentinel
INFO:msticpy.data.core.data_providers:Driver class: AzureSearchDriver
...
INFO:msticpy.data.core.data_providers:Calling connect on driver
INFO:msticpy.data.drivers.azure_monitor_driver:WorkspaceConfig created from workspace name BasicLogs
...
INFO:msticpy.data.drivers.azure_monitor_driver:WorkspaceConfig created from workspace name BasicLogs
INFO:msticpy.data.drivers.azure_search_driver:Created HTTP-based query client using /search endpoint.
connected
INFO:msticpy.data.core.data_providers:Adding query pivot functions
Query provider driver: <class 'msticpy.data.drivers.azure_search_driver.AzureSearchDriver'>
Query provider environment: MSSentinel
Query provider connections: ['Default: BasicLogs']
Start: 2025-02-16 19:20:23.193644+00:00, End: 2025-02-16 20:20:23.193644+00:00
INFO:msticpy.data.core.query_provider_connections_mixin:Executing query 'SyslogBasic_CL | take 1...'
DEBUG:msticpy.data.core.query_provider_connections_mixin:Full query: SyslogBasic_CL | take 1
DEBUG:msticpy.data.core.query_provider_connections_mixin:Query options: {'start': datetime.datetime(2025, 2, 16, 19, 20, 23, 193644, tzinfo=datetime.timezone.utc), 'end': datetime.datetime(2025, 2, 16, 20, 20, 23, 193644, tzinfo=datetime.timezone.utc)}
Traceback (most recent call last):
  File "/home/***/SecurityNotebooks/demo/_bugs/msticpy/msticpy_cannot_query_azure_monitor_basic_logs/test_data_prov_ms_sentinel_search.py", line 39, in <module>
    df = qry_prov_basic_search.exec_query('SyslogBasic_CL | take 1', start=start, end=end)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/***/SecurityNotebooks/demo/_bugs/msticpy/msticpy_cannot_query_azure_monitor_basic_logs/src/msticpy/msticpy/data/core/query_provider_connections_mixin.py", line 99, in exec_query
    return self._query_provider.query(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/***/SecurityNotebooks/demo/_bugs/msticpy/msticpy_cannot_query_azure_monitor_basic_logs/src/msticpy/msticpy/data/drivers/azure_monitor_driver.py", line 294, in query
    raise MsticpyNotConnectedError(
msticpy.common.exceptions.MsticpyNotConnectedError: ('Workspace not connected.', 'Please run connect() to connect to the workspace', 'before running a query.', 'https://msticpy.readthedocs.io/en/latest/data_acquisition/DataProv-MSSentinel.html')

Environment (please complete the following information):

  • Python Version: 3.111
  • OS: Ubuntu
  • Python environment: editable venv from cloned msticpy source with main branch...

Copilot AI and others added 4 commits January 21, 2026 17:37
…d override query method

Co-authored-by: ianhelle <13070017+ianhelle@users.noreply.github.com>
…ation

Co-authored-by: ianhelle <13070017+ianhelle@users.noreply.github.com>
Co-authored-by: ianhelle <13070017+ianhelle@users.noreply.github.com>
Co-authored-by: ianhelle <13070017+ianhelle@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix MSSentinelSearch data provider connection issue with AzureSearchDriver" to "Fix MSSentinelSearch environment and connection handling in AzureSearchDriver" Jan 21, 2026
Copilot AI requested a review from ianhelle January 21, 2026 17:46
Copilot AI changed the title from "Fix MSSentinelSearch environment and connection handling in AzureSearchDriver" to "Fix MSSentinelSearch environment name and connection check in AzureSearchDriver" Jan 21, 2026
@ianhelle ianhelle marked this pull request as ready for review January 22, 2026 20:56
@ianhelle ianhelle enabled auto-merge January 30, 2026 18:30
- Add drop_duplicates(subset=['query']) before merge in get_whois_df to prevent
  row multiplication from duplicate whois results
- Change net_df fixture scope from module to function for test isolation with
  random sampling
- Add autouse fixture to clear LRU caches (get_whois_info, _whois_lookup) between
  tests to prevent state leakage
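The cache-related item in the commit message above is easy to demonstrate with the standard library. The sketch below uses a hypothetical cached function named lookup in place of get_whois_info and _whois_lookup, and simulates two tests in sequence. In a pytest suite the cache_clear() call would typically sit inside a fixture declared with autouse=True.

```python
from functools import lru_cache

calls = []


@lru_cache(maxsize=None)
def lookup(ip_address):
    """Hypothetical stand-in for a cached whois lookup."""
    calls.append(ip_address)
    return f"whois result for {ip_address}"


# "Test 1" populates the cache.
lookup("198.51.100.7")

# "Test 2" without clearing: the function body is skipped, so a test that
# mocks or counts the underlying call would see state left by test 1.
lookup("198.51.100.7")
calls_without_clear = len(calls)

# Clearing between tests forces a fresh lookup.
lookup.cache_clear()
lookup("198.51.100.7")
calls_after_clear = len(calls)

print(calls_without_clear, calls_after_clear)
```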
@ianhelle ianhelle requested a review from FlorianBracq February 2, 2026 16:38

ianhelle commented Feb 2, 2026

Thx for the review @FlorianBracq -
These 3 PRs were all Co-pilot authored from issues - a bit of an experiment but not bad responses. However, I should have looked at them with a more critical eye than I did. I took your suggestions on this.

@FlorianBracq FlorianBracq left a comment

Sorry, a few more changes would probably help IMO.
Feel free to correct me if you feel otherwise!

the underlying provider result if an error.

"""
if not self._connected or self._query_client is None:

I thought we would overload _ensure_connected (or the property connected) in this class (and AzureSearch) to ensure both conditions (self._connected being True and self._query_client/auth_header not being None) are validated?
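One way to read this suggestion is sketched below, using hypothetical stand-in classes rather than the real msticpy drivers. It assumes the base class exposes a connected property and that query() consults that property, so each subclass can state its own pair of conditions.

```python
class DriverBase:
    """Hypothetical stand-in for the base driver class."""

    def __init__(self):
        self._connected = False

    @property
    def connected(self):
        return self._connected

    def query(self, query):
        # Callers go through the property, so subclasses can refine it.
        if not self.connected:
            raise RuntimeError("Workspace not connected.")
        return f"ran: {query}"


class MonitorDriver(DriverBase):
    def __init__(self):
        super().__init__()
        self._query_client = None

    @property
    def connected(self):
        return self._connected and self._query_client is not None


class SearchDriver(MonitorDriver):
    def __init__(self):
        super().__init__()
        self._auth_header = None

    @property
    def connected(self):
        # Deliberately does not defer to the parent property: this
        # driver never creates _query_client.
        return self._connected and self._auth_header is not None


driver = SearchDriver()
driver._connected = True  # flag set, but no header yet
before = driver.connected
driver._auth_header = {"Authorization": "Bearer <token>"}
after = driver.connected
print(before, after)
```

With this arrangement neither subclass needs its own copy of query(); only the definition of "connected" varies.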

title="Workspace not connected.",
help_uri=_HELP_URL,
)
self._ensure_connected("Azure Monitor")

I would remove the hardcoded parameter as this would impact inheritance. Instead, we maybe want to add a new class variable that would contain the "display name" of the driver and use this?
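A minimal sketch of the class-variable idea follows. The attribute name DISPLAY_NAME, the class names, and the message text are all assumptions made for illustration; nothing here is taken from the msticpy source.

```python
class MonitorDriver:
    # Hypothetical class attribute holding the driver's display name.
    DISPLAY_NAME = "Azure Monitor"

    def __init__(self):
        self._connected = False

    def _ensure_connected(self):
        # No hardcoded argument: the name is resolved on the instance's class.
        if not self._connected:
            raise RuntimeError(f"{self.DISPLAY_NAME} driver is not connected.")


class SearchDriver(MonitorDriver):
    # The subclass only overrides the name; the helper is inherited as-is.
    DISPLAY_NAME = "Azure Monitor Search"


try:
    SearchDriver()._ensure_connected()
    message = ""
except RuntimeError as err:
    message = str(err)

print(message)
```

Because the lookup goes through self, an inherited call site reports the subclass's name without any change to the parent's code.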

self._connected = True
logger.info("Created HTTP-based query client using /search endpoint.")

def query(

Isn't this the same implementation as in the parent class?

If the driver is not connected.

"""
if not self._connected:

I read again the code for DriverBase, and I'm wondering why we are using _connected and not the property connected here?
It would be easier to overload (and probably to understand) the definition of the property connected in child class

)
# Check if authentication token is present

if "X-Redlock-Auth" not in self.client.headers:

this should probably move in the definition of property connected for this driver?
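A sketch of what moving the header check into the property could look like is shown below. FakeClient and TokenDriver are hypothetical stand-ins; only the header name X-Redlock-Auth comes from the diff context quoted above.

```python
class FakeClient:
    """Hypothetical stand-in for an HTTP client with a headers mapping."""

    def __init__(self):
        self.headers = {}


class TokenDriver:
    AUTH_HEADER = "X-Redlock-Auth"  # header name from the quoted diff

    def __init__(self):
        self._connected = False
        self.client = FakeClient()

    @property
    def connected(self):
        # Connected means: flag set AND the auth token is still present.
        return self._connected and self.AUTH_HEADER in self.client.headers

    def connect(self, token):
        self.client.headers[self.AUTH_HEADER] = token
        self._connected = True


driver = TokenDriver()
driver.connect("example-token")
with_token = driver.connected

# If the header is removed later, the property reflects it immediately.
del driver.client.headers[TokenDriver.AUTH_HEADER]
without_token = driver.connected

print(with_token, without_token)
```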

self.set_driver_property(
DriverProps.EFFECTIVE_ENV, DataEnvironment.MSSentinelSearch.name
)
# Override query filter to include MSSentinelSearch

s/override/extend/

Development

Successfully merging this pull request may close these issues.

[Bug]: New experimental MSSentinelSearch data provider doesn't correctly use the AzureSearchDriver
