Export Historical Log Data from Microsoft Sentinel (2024)

The need for very large security log datasets to support complex security analytics and ML is ever-increasing. To meet it, security analysts and data scientists need an easy way to export, transform, and store data that is flexible, highly performant, and scalable.

We have previously blogged about setting up continuous data exports directly from the Sentinel UI using the Sentinel data export tool. To augment this, we have created a new Sentinel notebook that provides an easy way to orchestrate the export, transformation, and partitioning of historical data in your Azure Log Analytics workspace. Together, these provide a log data management solution for downstream analytics or for archival purposes that only requires a one-time setup.

Synapse Integration for Sentinel Notebooks

The new historical data export notebook uses Azure Synapse to work with data at scale. If you do not have the Synapse integration set up for your Sentinel notebooks, you can follow the steps here.

Continuous Log Export

The Data Export blade in Sentinel allows data to be continuously exported from tables in your Log Analytics workspace to an Azure storage account or Event Hub. You may wish to set up a continuous data export rule to eliminate the need to re-run manual data exports on a scheduled basis. The continuous export also allows Sentinel hunting notebooks using exported logs to utilize the latest data.

It is recommended to set up any continuous log export rule prior to performing a one-time export of historical logs to ensure that there is no gap in exported logs. It is also recommended that data is exported to Azure Data Lake Storage (ADLS) Gen2 to take advantage of the hierarchical namespace this provides (this will be important in a later step).

For a walkthrough on setting up new export rules, take a look at our previous blog, Configure a continuous data pipeline in Microsoft Sentinel for big data analytics.

Note: Log Analytics Data Export is currently free, but billing will start on July 1, 2022 – see the pricing page for details. Advance notice will be provided before billing starts.

Link ADLS Gen2 to Synapse Workspace

If the primary storage account for your Synapse workspace is not the account to which you want to export log data, you will need to create a new Azure Data Lake Storage Gen2 linked service by following the instructions here: Create an Azure Data Lake Storage Gen2 linked service using UI.

Launch the Notebook

This notebook can be launched straight from your Sentinel workspace by following the steps below.

  1. In the Sentinel portal, navigate to the Notebooks blade.
  2. Go to the Templates tab.
  3. Search for, and select, the “Export Historical Data” notebook.
  4. On the right panel, select Save notebook. You can rename the selected notebook or keep the default name and save it to an Azure ML workspace.
  5. The notebook is now accessible in your Azure ML workspace. From the same panel, select Launch notebook to open the notebook in Azure ML studio. (You may be prompted to log into the Azure ML workspace.)
  6. In the Azure ML workspace, notice that an Export Historical Data.ipynb file and a config.json file have been automatically generated from the previous step.
    The ipynb notebook file has the main content of the notebook whilst the config.json file stores configuration details about the Sentinel environment from which the notebook was launched.
  7. Select a compute instance for your notebook server. If you don’t have a compute instance, create one by following step 4 in Launch a notebook using your Azure ML workspace.

Configure Data to be Exported

The notebook contains detailed instructions on how to use it; it is designed to provide a step-by-step walkthrough on exporting any subset of data from your Log Analytics workspace.

First, you will need to specify the subset of logs you wish to export. This can be either the name of a table or a specific KQL query. You may wish to run some exploratory queries in your Log Analytics workspace to determine which subset of columns or rows you wish to export.
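For instance, the export subset might be specified in either of these two ways (the variable names below are illustrative, not the notebook's actual parameters):

```python
# Hypothetical examples of specifying the export subset.

# Option 1: export an entire table by name.
table_name = "SecurityEvent"

# Option 2: export a filtered, projected subset via a KQL query
# (here: successful interactive logon events only).
kql_query = """
SecurityEvent
| where EventID == 4624
| project TimeGenerated, Computer, Account, LogonType
"""

print(table_name)
```
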

(Currently, data can only be exported from one table at a time – this will be changed in future updates.)

Set Data Export Time Range

The next step is to set the time range from which you want to export data. This is done by specifying an end datetime and the number of days back from the end datetime at which to start querying. If you have set up a continuous data export rule, you will want to set the end datetime to the time at which the continuous export started (you can determine this by checking the creation time of the export storage container).
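As an illustration (using hypothetical variable names rather than the notebook's actual parameters), the query window can be derived like this:

```python
from datetime import datetime, timedelta, timezone

# End of the window: e.g. the time at which continuous export started.
end_datetime = datetime(2022, 5, 1, tzinfo=timezone.utc)

# How far back before the end datetime to start querying.
days_back = 90

start_datetime = end_datetime - timedelta(days=days_back)
print(start_datetime.isoformat())  # 2022-01-31T00:00:00+00:00
```
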

Prior to running the data export, you can use the notebook to determine the size of data to be exported and the number of blobs that will be written, in order to accurately gauge costs associated with the data export.

The notebook uses batched, asynchronous calls to the Log Analytics REST API to retrieve data. Due to throttling and rate-limiting (see the Query API section in the docs), you may need to adjust the default value of the query batch size – there are detailed notes in the notebook on how to set this value.
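The notebook's actual batching code isn't reproduced here, but the core idea can be sketched as splitting the overall time range into fixed-size windows, with one API query issued per window so that each response stays within the API's result-size limits:

```python
from datetime import datetime, timedelta, timezone

def batch_time_range(start, end, batch_days):
    """Split [start, end) into consecutive windows of at most
    batch_days each, one Log Analytics query per window."""
    windows = []
    cursor = start
    while cursor < end:
        window_end = min(cursor + timedelta(days=batch_days), end)
        windows.append((cursor, window_end))
        cursor = window_end
    return windows

# A 9-day range with a 3-day batch size yields 3 query windows.
start = datetime(2022, 1, 1, tzinfo=timezone.utc)
end = datetime(2022, 1, 10, tzinfo=timezone.utc)
print(len(batch_time_range(start, end, batch_days=3)))  # 3
```

If a window still returns too much data (or hits rate limits), shrinking the batch size trades more API calls for smaller individual responses.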

Note: This step may take some time to run depending on the volume of data being exported.

You may wish to run this cell with only a few days of data initially, to ensure that the dataframe in the cell output contains the expected data (e.g., the expected set of columns and the expected number of rows).

Write Data to ADLS Gen2

Once the queries have run, the data can be persisted to Azure Data Lake Gen2 storage*. Fill in the details of your storage account in the notebook cell.

*Any Azure storage account can be used here, but the hierarchical namespace used by ADLS Gen2 makes moving and repartitioning log data in downstream tasks much more efficient.

You can view and rotate access keys for your storage account by navigating to the “Access Keys” blade in the Azure storage portal.

Note: The code shown above is for demo/testing purposes only! Keys should always be stored and retrieved securely (e.g. by using Azure Key Vault) and should never be stored as plaintext. Alternatively, you may wish to use another of Azure’s authentication flows (such as SAS tokens) – see the docs for details.

Partition Data Using Spark

At this point, the historical log data has been successfully exported for custom archiving or for use in Sentinel notebooks or other downstream tasks.

However, you may wish to partition the data to allow for more performant data reads. The last section of the notebook repartitions the exported data by timestamp – this means splitting the data rows across multiple files in multiple directories with rows of data grouped by timestamp. We use a year/month/day/hour/five-minute-interval directory structure for partitions.

This provides two key benefits:

  • Matching the partition scheme used by continuously exported logs – continuously exported data and historical log data can be read in a unified way by any notebooks or data pipelines that consume this data
  • More performant data loading – by encoding the timestamp values in file paths, we can minimize the number of file reads required when loading data from a specific time range in downstream tasks
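As a rough sketch (not the notebook's exact code), the partition path for a row can be derived from its timestamp by rounding down to the nearest five-minute boundary and encoding each component as a directory level:

```python
from datetime import datetime, timezone

def partition_path(ts: datetime) -> str:
    """Illustrative year/month/day/hour/five-minute partition path
    for a timestamp. The directory naming here is an assumption for
    demonstration; the notebook matches whatever scheme the
    continuous export uses."""
    minute_bucket = (ts.minute // 5) * 5  # round down to 5-min boundary
    return (f"y={ts.year:04d}/m={ts.month:02d}/d={ts.day:02d}/"
            f"h={ts.hour:02d}/m={minute_bucket:02d}")

ts = datetime(2022, 3, 15, 9, 47, tzinfo=timezone.utc)
print(partition_path(ts))  # y=2022/m=03/d=15/h=09/m=45
```

Because the path encodes the time range, a downstream reader looking for, say, one hour of logs only needs to list and open the files under that hour's directory.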

Using Spark via Azure Synapse

For a year's worth of historical log data, we may be writing files for over 100,000 separate partitions, so we rely on Spark's multi-executor parallelism to do this efficiently.
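That estimate is easy to sanity-check: at five-minute granularity, a year of data spans

```python
# Back-of-the-envelope partition count for one year of data
# at five-minute partition granularity.
partitions_per_year = 365 * 24 * 12  # days * hours * 5-min intervals per hour
print(partitions_per_year)  # 105120
```

which is well over 100,000 directories, far too many to write serially from a single process.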

In order to run code on a Synapse Spark pool, we will need to specify the name of the linked Synapse workspace and Synapse Spark pool to use (see Pre-Requisites section, above).

Once we have started the Spark session, we can run the code in a notebook cell on the Spark pool by using the `%%synapse` cell magic at the start of the cell.

Note: If you encounter “UsageError: Line magic function `%synapse` not found”, ensure that you have run the notebook setup cells (at the top of the notebook) and that the “azureml-synapse” package was installed successfully.

Running through the last few cells of the notebook will write the historical logs to the same location as the continuously exported data, in the same format and with the same partition scheme.

We are now able to process, transform, and analyze security log data at scale using Sentinel and Synapse notebooks! Get started by cloning one of our template guided hunting notebooks from the Templates tab under the Notebooks blade in Sentinel (also available on the Microsoft Sentinel Notebooks GitHub).

An important part of being able to extract value from large volumes of log data is the ability to make it available for advanced analytics and ML in a flexible, performant and highly scalable manner.

Sentinel users can now leverage Synapse Spark pools to orchestrate the ETL of data in their Log Analytics workspace directly from a Sentinel notebook.

Next Steps

Get started with big data analytics using one of our template guided hunting notebooks in Sentinel (we also have guided hunting blogs), or write your own big data Sentinel+Synapse notebook using PySpark, MLlib and SynapseML.

For native low-cost log archival in Log Analytics workspaces, use the new archive policies feature. Archive policies can be configured for individual tables, and archived data can be easily searched or restored directly from your Log Analytics workspace.

For more customizable archiving, you can use this notebook in conjunction with archive-tier Azure storage.
