This installment is part of a broader learning series to help you become a Jupyter Notebook ninja in Microsoft Sentinel. The installments will be bite-sized to enable you to easily digest the new content.
- Part 1: What are notebooks and when do you need them?
- Part 2: How to get started with notebooks and tour of the features
- Part 3: Overview of the pre-built notebooks and how to use them
- Part 3.5: Using Code Snippets to build your own Sentinel Notebooks
- Part 4: How to create your own notebooks from scratch and how to customize the existing ones – this post
KNOWLEDGE CHECK: Once you've completed all of the parts of this series, you can take the Knowledge Check. If you score 80% or more in the Knowledge Check, you can expect your very own Notebooks Ninja participation certificate from us.
Jupyter Notebooks are a fantastic resource for security analysts, providing a range of powerful and flexible capabilities. Microsoft Sentinel’s integration with Notebooks provides a quick and straightforward way for security analysts to use Notebooks. However, for those new to Notebooks and coding, they can be a little daunting.
In this blog we will cover some of the basics of creating your first Microsoft Sentinel Notebook using Python, including how to troubleshoot some common issues you may come across.
- Installing and importing packages in Python
- Installing and importing MSTICPy
- Setting up MSTICPy’s config file
- Getting data from Microsoft Sentinel
- Working with data
- Enriching results with external data sources
- Visualizations with MSTICPy
Before we begin, make sure to familiarize yourself with Notebooks in Microsoft Sentinel via Azure Machine Learning.
Use Jupyter Notebooks to hunt for security threats
If you wish to learn more about this topic, we are running introductory training on December 16th, 2021: Become a Jupyter Notebooks Ninja – MSTICPy Fundamentals to Build Your Own Notebooks. Sign Up Here
Installing and Importing Packages in Python
One of the important things about using Python in Notebooks is that you can install and use code libraries (referred to as packages) created by others, allowing you to access the functionality they provide without having to code it yourself.
There are several ways to install Python packages depending on how you want to find and access the packages; however, the simplest is using pip. Pip (https://pypi.org/project/pip/) is the package installer for Python and makes finding and installing Python packages simple.
You can use pip to install packages via the command line, or if you are using a Notebook, directly in a Notebook cell. Installing directly in a Notebook is often preferred as it ensures that you are installing the package in the same Python environment the Notebook is being executed in. To install via a Notebook code cell, we need to use `%pip` followed by `install` and the package name, e.g.:
%pip install requests
Notebook output of running %pip install requests
Note: `%pip` is what is called a magic function in Jupyter. This tells the Notebook to use pip to install the package in the Notebook's compute environment.
If you already have a package installed but you want to update to the latest version, you can add the `--upgrade` parameter to the command used:
%pip install --upgrade requests
You may also want to install a specific version of a package. This can be done by specifying the version number.
%pip install requests==2.22.0
Output of running %pip install requests==2.22.0
Note: Once you have installed a package it is recommended to restart the Notebook kernel; this will ensure that when you import the package you will be using the latest version. This is not necessary with newly installed packages but is important when upgrading a package that was already installed.
Note: During installation of packages you may see some warnings related to package dependencies. This is because some packages have requirements on other packages being installed and sometimes these requirements can conflict (e.g., package 1 requires package A version 1.1 but package 2 requires package A version 1.2). We try to avoid conflicts as much as possible with our Notebooks but sometimes these can occur. You can usually run the Notebook without the conflicts affecting you. However, if you encounter a problem with a pre-made Microsoft Sentinel Notebook, please report this via GitHub.
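If you want to confirm which version of a package the Notebook environment actually holds after an install or upgrade, you can ask Python directly. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is our own, chosen for illustration, and is not part of pip or Jupyter.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Check a couple of the packages used in this post.
for name in ("requests", "msticpy"):
    print(name, "->", installed_version(name) or "not installed")
```

This reads the metadata of what is installed on disk, so a module that was already imported before an upgrade will still be the old version until the kernel is restarted.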
Once a package is installed, you need to import the package before it can be used. This is done with the `import` statement.
There are two ways to import things in Python:
- `import <package>` - this will do a standard import of the package
- `from <package> import <item>` - this imports a specific item from the package
You can also import packages and rename them for ease when calling them later:
`import <package> as <alias>`
import pandas as pd
Troubleshooting Tip: Some packages do not use the same name for installation and import. You may need to check the package documentation to ensure you are importing correctly.
For example, the popular Machine Learning tool package scikit-learn is installed with:
%pip install scikit-learn
However, it is imported with:
import sklearn
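If you are unsure whether an import name is right, one option is to check whether Python can find a module of that name before importing it. This is a small sketch using the standard library's `importlib.util.find_spec`; `is_importable` is a hypothetical helper name used only for this example.

```python
from importlib import util


def is_importable(module_name):
    """Return True if a top-level module with this name can be found."""
    return util.find_spec(module_name) is not None


# The install name and the import name can differ, so test the import name.
for module_name in ("sklearn", "json"):
    print(module_name, "->", is_importable(module_name))
```

A result of `False` for a package you have just installed is a hint that the import name differs from the install name, or that the package went into a different Python environment.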
Installing and Importing MSTICPy
Now that we know how to install and import packages, we can install packages that will be useful to us in creating our Notebook. MSTICPy is a package created by the Microsoft Threat Intelligence Center (MSTIC) and provides a range of tools to make security analysis and investigations in Notebooks quicker and easier. You can find out more about MSTICPy here:
We can now install MSTICPy. To make sure we get the latest version if we already have it installed, we are going to use the `--upgrade` parameter.
%pip install --upgrade msticpy
Now we could import MSTICPy with `import msticpy`; however, it is a big package with a lot of features, so to make it easier we have a function called `init_notebook` that conducts several checks to make sure the environment is good, and handles key imports and setup for us.
import msticpy
msticpy.init_notebook(globals())
Notebook output of running previous code cell.
Setting up MSTICPy’s Config File
MSTICPy can handle connections to a variety of data sources and services, including Microsoft Sentinel. As such, it needs to handle several configuration details and credentials, such as the Microsoft Sentinel workspaces you want to get data from, or API (Application Programming Interface) keys for external services such as VirusTotal.
To make it easier to manage and re-use the configuration and credentials for these things, MSTICPy has its own config file that holds these items: `msticpyconfig.yaml`.
The first time you use MSTICPy you need to populate your msticpyconfig.yaml file. This is a one-time activity; once you have created it, you can simply re-use it in future. To help with the set-up we have created several Notebook widgets to help you populate the file.
Note: If using Azure Machine Learning then you may notice this config widget can take some time to load. We are working to improve this but if you run the notebook in Jupyter, JupyterLab or VSCode you will not have these performance issues.
We have also created a Notebook to help you create the file. Once you have run the ‘Getting Started’ Notebook, it is recommended that you run the ‘Configuring your Notebook Environment’ Notebook before creating your first Notebook. You can find this in the Microsoft Sentinel portal.
Microsoft Sentinel Notebook feature blade highlighting the Configuring your Notebook Environment Notebook
You can also find more documentation on the config file, and the creation of it, in the MSTICPy docs.
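A common stumbling block is MSTICPy not finding the config file at all. The sketch below checks two locations, on the assumption that MSTICPy looks at the path in the `MSTICPYCONFIG` environment variable and at `msticpyconfig.yaml` in the current directory; your version may check other locations too, so treat this as a quick sanity check rather than a description of the library's full search logic. `find_msticpy_config` is a hypothetical helper, not an MSTICPy function.

```python
import os
from pathlib import Path


def find_msticpy_config(cwd=None):
    """Return the path of the first msticpyconfig.yaml found, or None."""
    candidates = []
    # Assumption: an explicit path in MSTICPYCONFIG takes priority.
    env_path = os.environ.get("MSTICPYCONFIG")
    if env_path:
        candidates.append(Path(env_path))
    # Otherwise fall back to the working directory of the Notebook.
    candidates.append(Path(cwd or Path.cwd()) / "msticpyconfig.yaml")
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


config_path = find_msticpy_config()
print(config_path if config_path else "No msticpyconfig.yaml found")
```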
Getting Data from Microsoft Sentinel
Querying data from Microsoft Sentinel is handled by MSTICPy's `QueryProvider`. The first step is to initialize a QueryProvider and tell it we want to use the Microsoft Sentinel Query provider.
Note: MSTICPy contains several QueryProviders for other data sources as well.
The other thing we want to provide the QueryProvider with is some details of the workspace we want to connect to. We *could* do this manually, but it is much easier to get details from the configuration we set up earlier. We can do this with `WorkspaceConfig`:
from msticpy.nbtools import nbinit
nbinit.init_notebook(namespace=globals())
# Create a query provider for Microsoft Sentinel
qry_prov = QueryProvider("MicrosoftSentinel")
# Load the workspace details from msticpyconfig.yaml
ws_config = WorkspaceConfig(workspace="MyWorkspace")
What WorkspaceConfig is doing is creating the connection string used by the QueryProvider. We can see what that connection string looks like with:
ws_config.code_connect_str
Notebook output showing the connection string generated by code_connect_str
Once set up we can tell the `QueryProvider` to `connect`, which will kick off the authentication process. There are several ways that we can handle that authentication, but when starting off we can use the default option, which prompts the user to log in using a Device Code.
qry_prov.connect(ws_config)
This will then display a code in the Notebook cell output and prompt you to open a browser and enter the code shown. You will then log in as normal using your Azure AD (Azure Active Directory) credentials.
Screenshots of the Device Code authentication flow
You can then go back to the Notebook and see that the authentication has been completed:
Notebook output showing the completed authentication flow
Built-in Queries
Now that we are connected to Microsoft Sentinel, we can start to look at running some queries to get some data. MSTICPy comes with several built-in Microsoft Sentinel queries to get some common datasets into the Notebook. These are different to the queries included in the Microsoft Sentinel GitHub and are more focused on collecting common sets of data that users might need to answer analytical questions.
You can see a list of the MSTICPy queries with `.list_queries()`.
Notebook output of the list_queries command
Note: MSTICPy also includes queries for its other Data Providers, and not just Microsoft Sentinel.
You can also use `.browse_queries()` to see the available queries in an interactive browser widget.
Notebook output of browse_queries
Running a query
Now that we have found a query that we want to run, we simply pass its name to the `QueryProvider`, which in turn returns the results of the query in a Pandas DataFrame. Most queries support additional parameters, but we are showing one here that does not need any parameters.
Note: the queries are attached to the QueryProvider as methods (functions) and grouped into categories based on the data source being queried. You can use tab completion or IntelliSense to help you navigate to the query you need.
qry_prov.Azure.list_all_signins_geo()
Output of the list_all_signins_geo query
Troubleshooting tip: If a query does not execute at first make sure you have run `qry_prov.connect()` to authenticate to Microsoft Sentinel first. Notebook cells do not have to be run in order so you can go back and run any that you missed. However, many notebooks do have cells that rely on previous cells being executed first so be careful about jumping ahead if you have not created the notebook yourself.
Troubleshooting tip: If a query is not returning the results you expect, pass ‘print’ along as a parameter when calling the query to print out the KQL query being executed.
More typically the query function will expect parameters such as the host name or IP address that you are searching for.
qry_prov.LinuxSyslog.user_logon(host_name="mylxhost")
If you try to run a query without supplying the required parameter, it will return an error message including the help for the query with the parameter definitions.
Most queries also require date/time parameters for the beginning/ending bounds of the query. By default, these are supplied by a time range set in the query provider. Each instance of a query provider has its own time range. You can change the default query range by running the following.
qry_prov.query_time
This brings up a widget letting you change the defaults for this query provider. You can also supply "start" and "end" parameters to the query function, either as Python datetimes or as time strings:
from datetime import datetime
qry_prov.LinuxSyslog.user_logon(host_name="mylxhost", start="2021-11-19 20:30", end=datetime.utcnow())
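When you want a relative window such as "the last two days" rather than fixed timestamps, the bounds can be computed with the standard library. This is a minimal sketch; `time_window` is a hypothetical helper, and it assumes the query functions accept timezone-aware datetimes in the same way as the naive ones shown above.

```python
from datetime import datetime, timedelta, timezone


def time_window(days, end=None):
    """Return (start, end) datetimes covering the `days` before `end` (default: now, UTC)."""
    end = end or datetime.now(timezone.utc)
    return end - timedelta(days=days), end


start, end = time_window(days=2)
print(start.isoformat(), "->", end.isoformat())

# With a connected QueryProvider these could then be passed to a query, e.g.:
# qry_prov.LinuxSyslog.user_logon(host_name="mylxhost", start=start, end=end)
```

Working in UTC keeps the window consistent with the UTC timestamps that Microsoft Sentinel tables use, whatever the time zone of the machine running the Notebook.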
Customizing Your Queries
In addition to the stock query, we can customize certain elements of the query.
For example, if we want to append a line with `| take 10` to the query we have selected, to limit the number of results returned, we can pass that in with the `add_query_items` parameter:
qry_prov.SecurityAlert.list_alerts(add_query_items="| take 10")
The output of the list_alerts query
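If you want to append more than one clause, it can help to build the string in one place. The sketch below assumes that whatever is passed to `add_query_items` is appended as-is to the end of the KQL query, as the `| take 10` example suggests; `build_query_items` is a hypothetical helper and not part of MSTICPy.

```python
def build_query_items(*clauses):
    """Join KQL clauses into one string, making sure each starts with a pipe."""
    cleaned = []
    for clause in clauses:
        clause = clause.strip()
        if not clause:
            continue  # ignore empty clauses
        if not clause.startswith("|"):
            clause = "| " + clause
        cleaned.append(clause)
    return " ".join(cleaned)


extra = build_query_items('where AlertSeverity == "High"', "| take 10")
print(extra)

# With a connected QueryProvider:
# qry_prov.SecurityAlert.list_alerts(add_query_items=extra)
```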
Tip: You can also use KQLMagic to query Sentinel data using KQL queries within notebooks. KQLMagic also returns data in a Pandas Data Frame.
Working With Data
Data returned by the `QueryProvider` comes back in a Pandas DataFrame. This provides us with a powerful and flexible way to access our data.
One of the core things we want to do is look at specific rows in our table. Each table has an index that can be used to call a row using `.loc`; alternatively, we can return a row by its position in the table with `.iloc`:
alert_df.loc[1]
Selecting a row with iloc
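The difference between `.loc` and `.iloc` is easiest to see when the index labels do not match the row positions, which often happens after filtering or sorting. The following sketch uses a small made-up DataFrame in place of real query results.

```python
import pandas as pd

# Sample data standing in for query results; the index is deliberately not 0, 1, 2.
alert_df = pd.DataFrame(
    {
        "AlertName": ["Suspicious sign-in", "Malware detected", "Port scan"],
        "AlertSeverity": ["High", "Medium", "Low"],
    },
    index=[10, 11, 12],
)

by_label = alert_df.loc[11]     # the row whose index label is 11
by_position = alert_df.iloc[1]  # the second row, whatever its label is

print(by_label["AlertName"])
print(by_position["AlertName"])
```

Here both lines select the same row. Calling `alert_df.loc[1]` on this data would raise a `KeyError`, because no row carries the label 1.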
We can also choose just to return specific columns by providing a list of them to the DataFrame (note the "[:5]" means return the first 5 rows):
alert_df.iloc[:5][["AlertName", "AlertSeverity", "Description"]]
Filtering columns of a DataFrame
We can also do things such as search for rows with specific data:
alert_df[alert_df["AlertName"].str.contains("credential theft")]
Searching for rows of a DataFrame matching a criteria
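Real query results often contain mixed capitalisation and missing values, both of which can trip up a plain `str.contains` search. The sketch below, using made-up alert names, shows two optional arguments that make the search more forgiving.

```python
import pandas as pd

alert_df = pd.DataFrame(
    {
        "AlertName": [
            "Possible Credential Theft attempt",
            "Port scan detected",
            None,  # missing values are common in real query results
            "credential theft tool observed",
        ]
    }
)

# case=False ignores capitalisation; na=False treats missing names as "no match".
mask = alert_df["AlertName"].str.contains("credential theft", case=False, na=False)
matches = alert_df[mask]
print(matches)
```

Depending on your pandas version, leaving out `na=False` can produce missing values in the mask, which pandas may then refuse to filter with.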
Tip: Pandas has loads and loads of features to help you find, analyze, transform, and visualize data. As Pandas data structures are key to Microsoft Sentinel Notebooks, we recommend you spend some time getting familiar with some of the features they offer - https://pandas.pydata.org/
Enriching data using external data sources
One of the powerful elements of Notebooks is combining data from Microsoft Sentinel with data from other sources. One of the most common sources of this data in security is Threat Intelligence (TI) data. MSTICPy has support for several Threat Intelligence data sources including:
- VirusTotal
- GreyNoise
- AlienVault OTX
- IBM XForce
- Microsoft Sentinel TI data
- OPR (for PageRank details)
- Tor exit node information
The first step in using these TI sources is to create a `TILookup` object. This can then be used to perform lookups against the various supported providers.
Lookups can be done against individual items via `.lookup_ioc` or against multiple items with `.lookup_iocs` and you can configure things such as which Threat Intelligence sources are used.
ti = TILookup()
# Keep the results so they can be passed to the results browser
ti_df = ti.lookup_iocs(signin_df, obs_col="IPAddress", providers=["GreyNoise"])
ti_df
Lookup_iocs results
To make viewing results easier there is a widget to allow you to interactively browse results:
ti.browse_results(ti_df)
TI results browser widget
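External TI providers usually have nothing to say about private or malformed addresses, and many enforce API quotas, so it can be worth trimming the list of IPs before a lookup. This is a sketch of one way to do that with the standard library's `ipaddress` module; `public_ips` is a hypothetical helper and not part of MSTICPy.

```python
import ipaddress


def public_ips(values):
    """Return unique, globally routable IP addresses from an iterable of strings."""
    seen = []
    for value in values:
        try:
            addr = ipaddress.ip_address(str(value).strip())
        except ValueError:
            continue  # skip blanks and anything that is not an IP address
        if addr.is_global and str(addr) not in seen:
            seen.append(str(addr))
    return seen


sample = ["8.8.8.8", "10.0.0.5", "192.168.1.20", "8.8.8.8", "", "not-an-ip", "1.1.1.1"]
print(public_ips(sample))
```

With real data you could pass in a column such as `signin_df["IPAddress"].dropna()` and then restrict the DataFrame to the returned addresses before calling `lookup_iocs`.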
Azure API Access
MSTICPy also has integration with a range of Azure APIs that can be used to retrieve additional information or perform actions such as getting Microsoft Sentinel incidents.
from msticpy.data.azure_sentinel import AzureSentinel
azs = AzureSentinel()
azs.connect()
azs.get_incident(incident_id="7c768f11-31f1-46ca-8a5c-25df2e6b7021", sub_id="8df49d90-99eb-4c31-985d-64b3f33caa93", res_grp="sent", ws_name="workspace")
Output of Azure APIs
You can find out more about MSTICPy’s support for Azure APIs in the documentation: https://msticpy.readthedocs.io/en/latest/data_acquisition/AzureData.html and https://msticpy.readthedocs.io/en/latest/data_acquisition/AzureSentinel.html
Visualizations with MSTICPy
The ability to create complex, interactive visualizations is one of the key benefits of Notebooks, allowing analysts to see data in a unique way and use it to identify patterns of anomalies that may not otherwise be possible to identify.
Creating these visualizations from scratch can be quite a complex task and involve a lot of code. To make the process easier, MSTICPy contains several common visualizations that work out of the box with common data sources from Microsoft Sentinel, and that can quickly and easily be called with minimal code.
Timelines
Understanding when events occurred and in what order is a key component of many security investigations. MSTICPy can plot diverse types of timelines with several types of data.
user_df = qry_prov.Azure.list_aad_signins_for_account(account_name="pdemo@seccxpninja.onmicrosoft.com")
timeline.display_timeline(user_df, source_columns=["UserPrincipalName", "ResultType"])
Timeline visualization
Troubleshooting Tip: If you are defining columns from a DataFrame as a parameter in another function (as we do above with source_columns) you can sometimes run into issues if you specify a column that does not exist. If you want to see what columns a DataFrame has you can call `DataFrame.columns` to get a list of all the columns.
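One way to apply this tip is to compare the columns you intend to pass against `DataFrame.columns` before calling the plotting function. The sketch below uses a small made-up DataFrame in place of real sign-in data; `missing_columns` is a hypothetical helper.

```python
import pandas as pd


def missing_columns(df, wanted):
    """Return the names in `wanted` that are not columns of `df`."""
    return [col for col in wanted if col not in df.columns]


# Sample data standing in for sign-in query results.
user_df = pd.DataFrame(
    {
        "TimeGenerated": pd.to_datetime(["2021-11-19 20:30", "2021-11-19 21:00"]),
        "UserPrincipalName": ["user@contoso.com", "user@contoso.com"],
        "ResultType": ["0", "50126"],
    }
)

print(list(user_df.columns))
print(missing_columns(user_df, ["UserPrincipalName", "ResultType", "IPAddress"]))
```

An empty list means every requested column is present; anything else names the columns to fix before plotting.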
We can also plot timelines showing events with a duration rather than a single timestamp with `display_timeline_duration`:
timeline_duration.display_timeline_duration(alert_df, group_by="AlertName", time_column="StartTimeUtc", end_time_column="EndTimeUtc")
Timeline duration visualization
Tip: You can also call the timeline visualization directly from a DataFrame with ‘mp_plot’
alert_df.mp_plot.timeline(group_by="Severity", source_columns=["AlertName", "TimeGenerated"])
Grouped timeline visualization
Matrix Plots
The Matrix Plot graph in MSTICPy allows you to plot the interactions between two elements in your data. This can be useful for seeing the relationships between points in a dataset. For example, if you wanted to see how often certain IP addresses are communicating with each other in a network, you can create a matrix plot with a source IP address on one axis and a destination IP address on the other axis.
As with the timeline plots, the matrix plot can be created directly from a DataFrame using `mp_plot`:
network_data.mp_plot.matrix(x="SourceIP", y="DestinationIP", title="IP Interaction")
Matrix visualization
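To get a feel for the numbers behind such a plot, you can tabulate how often each pair of values occurs using plain pandas. This is only a sketch of the underlying idea, using made-up flow records; it assumes the plot is counting interactions and does not reproduce how MSTICPy builds the chart.

```python
import pandas as pd

# Sample flow records standing in for network query results.
network_data = pd.DataFrame(
    {
        "SourceIP": ["10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1"],
        "DestinationIP": ["10.0.1.5", "10.0.1.5", "10.0.1.5", "10.0.1.9"],
    }
)

# Count how many times each source/destination pair appears.
interactions = pd.crosstab(network_data["SourceIP"], network_data["DestinationIP"])
print(interactions)
```

Each cell of the resulting table corresponds to one intersection on the matrix plot, with sources as rows and destinations as columns.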
Widgets
We have seen a couple of widgets already in the query and threat intelligence result browsers. These widgets make Notebooks much more accessible by providing a visual way to interact and customize them without having to write any code. MSTICPy includes a number of visual, interactive widgets to allow users to select various parameters to customize the Notebook.
network_vendor_data_q = "CommonSecurityLog | summarize by DeviceVendor"
network_vendor_data = qry_prov.exec_query(network_vendor_data_q)
network_selector = nbwidgets.SelectItem(
    item_list=network_vendor_data["DeviceVendor"].to_list(),
    description='Select a vendor',
    action=print,
    auto_display=True
);
Using the SelectItem widget to select a network vendor from data
q_times = nbwidgets.QueryTime(units='day', max_before=20, before=5, max_after=1)
q_times.display()
Time range selection widget
security_alerts = qry_prov.SecurityAlert.list_alerts(add_query_items="| take 10")
alert_select = nbwidgets.SelectAlert(alerts=security_alerts, action=nbdisplay.display_alert)
display(Markdown('### Alert selector with action=DisplayAlert'))
display(HTML("<b> Alert selector with action=DisplayAlert </b>"))
alert_select.display()
Alert selector widget
What to do Next
What you have seen here is just a tiny taster of what Microsoft Sentinel Notebooks can do. Luckily, we have a lot of additional resources to help you learn what you need and get started with Notebooks.
We recommend that you do the following:
- Sign up for the webinar below, where we will cover the topics in this blog in an interactive manner, and where you can see the code being executed and learn some extra hints and tips about running Notebooks.
- December 16th 2021 - Become a Jupyter Notebooks Ninja – MSTICPy Fundamentals to Build Your Own Notebooks - Sign Up Here
- Run the Getting Started Notebook in Microsoft Sentinel
- This will help you get your config set up
- This documentation will help you in running this notebook
- There is also an online tutorial
- Try the interactive MSTICPy Lab – https://aka.ms/msticpy-demo
- Go and read the MSTICPy docs – https://msticpy.readthedocs.io/en/latest/GettingStarted.html
- Learn more about Pandas - https://pandas.pydata.org/docs/
- Check out our other Notebooks for ideas! - https://github.com/Azure/Azure-Sentinel-Notebooks