The magic of CluedIn

https://youtu.be/WrVayWXPUOo?si=XIUybRa5jS6w0Bmk

IPython, the interactive shell behind Jupyter notebooks, provides an awesome feature called magics. In short, magics let you skip writing Python code and use a command-line-style syntax instead.

This approach can simplify many repetitive tasks, including working with the CluedIn Python SDK. In this article, I want to introduce you to CluedIn Magic, a package that lets you work with the CluedIn API with minimal code.

CluedIn Magic depends on the CluedIn Python SDK, so you only need to install one package to get them both:

%pip install cluedin-magic

When working with products like Microsoft Fabric, Synapse Analytics, or Databricks, I usually pre-install the packages in the environment, so there is no need to run the line above.

Now, we can load CluedIn Magic by calling the %load_ext magic:

%load_ext cluedin_magic

After this, you can call the %cluedin magic. If you call it without parameters or with invalid parameters, it prints a brief help message:

Available commands: get-context, search
Usage:
%cluedin get-context --jwt <jwt>
%cluedin search --context <context> --query <query> [--limit <limit>]
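To make the "command line syntax" idea more concrete, here is a minimal sketch of how a line like the ones above could be split into a command and its options. This is not CluedIn Magic's actual implementation; `parse_magic_line` is a hypothetical helper I made up for illustration, and it only assumes that every option starts with `--` and runs until the next `--` option.

```python
import re


def parse_magic_line(line: str) -> dict:
    """Split the text after the magic name into a command and its --options.

    Hypothetical helper for illustration only, not part of CluedIn Magic.
    """
    command, _, rest = line.strip().partition(" ")
    options = {}
    # Each option value runs from "--name" up to the next "--name" or the end of the line,
    # so a query may contain spaces and single-dash terms.
    for match in re.finditer(r"--(\w[\w-]*)\s+(.*?)(?=\s+--\w|$)", rest):
        options[match.group(1)] = match.group(2).strip()
    return {"command": command, "options": options}


example = parse_magic_line("search --context ctx --query +entityType:/Infrastructure/User --limit 10")
print(example["command"])  # the command name
print(example["options"])  # option names mapped to their raw string values
```

Note that the values stay plain strings here; a real magic would still have to look up the context variable by name and convert the limit to a number.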

Get CluedIn context

When you work with the CluedIn API, you need a context: a domain, an organization name, and either an email and password or an API token. What if I told you that the API token alone is enough, and CluedIn Magic will automagically resolve the rest? Let's try it!

First, you need an API token; you can get one from Administration -> API Tokens in CluedIn.

In the example below, I store the token in an environment variable, read it into a Python variable, and pass it to CluedIn Magic:

access_token = %env ACCESS_TOKEN
ctx = %cluedin get-context --jwt $access_token

You can also pass the token inline, and CluedIn Magic will give you a working CluedIn context:

ctx = %cluedin get-context --jwt eyJhbGci...5Odvpr1g

You can now use this context with the CluedIn Python SDK or CluedIn Magic.
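If you wonder how a single token can carry enough information to resolve the rest, remember that a JWT is three base64url-encoded segments, and the middle one is a JSON document with claims. The sketch below decodes that payload using only the standard library. It is an illustration of the general JWT format, not of what CluedIn Magic does internally, and the `org` claim in the sample token is a made-up name; a real CluedIn token may use different claims.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims from a JWT's payload segment without verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with the padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(data: dict) -> str:
    """Encode a dict the way a JWT segment is encoded (used only to build the sample)."""
    raw = json.dumps(data).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Hypothetical claims for illustration; "org" is an assumed name, "iss" is a standard claim.
sample_claims = {"org": "foobar", "iss": "https://cluedin.example.com"}
sample_token = ".".join([_b64url({"alg": "none", "typ": "JWT"}), _b64url(sample_claims), "signature"])

claims = decode_jwt_payload(sample_token)
print(claims)
```

Decoding without checking the signature is fine for peeking at claims, but it says nothing about whether the token is valid; that check stays with the server.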

Search

Say you want to load all /Infrastructure/User entities. Provide a context and a query, and you get a pandas DataFrame with your data:

%cluedin search --context ctx --query +entityType:/Infrastructure/User

If you have millions of entities, you can get a sample by providing a limit. In the next example, I get ten entities out of all the entities in the system:

%cluedin search --context ctx --query * --limit 10

In the next example, I get ten records of type /IMDb/Name where the imdb.name.birthYear vocabulary key does not equal \\N:

%cluedin search --context ctx --query +entityType:/IMDb/Name -properties.imdb.name.birthYear:"\\\\N" --limit 10
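The queries above follow a simple pattern: terms prefixed with `+` must match, and terms prefixed with `-` must not. If you build such queries often, a tiny helper can assemble them from dictionaries. `build_query` below is a hypothetical function of my own, not something CluedIn Magic or the SDK provides, and it does no escaping of special characters.

```python
from typing import Mapping, Optional


def build_query(required: Mapping[str, str], excluded: Optional[Mapping[str, str]] = None) -> str:
    """Join field:value terms, prefixing required ones with '+' and excluded ones with '-'.

    Hypothetical helper for illustration; values are inserted as-is, without escaping.
    """
    terms = [f"+{field}:{value}" for field, value in required.items()]
    terms += [f"-{field}:{value}" for field, value in (excluded or {}).items()]
    return " ".join(terms)


query = build_query(
    {"entityType": "/IMDb/Name"},
    {"properties.imdb.name.birthYear": "1981"},
)
print(query)  # +entityType:/IMDb/Name -properties.imdb.name.birthYear:1981
```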

You can also save the results in a variable and use it as a regular pandas DataFrame:

df = %cluedin search --context ctx --query +entityType:/IMDb/Name +properties.imdb.name.birthYear:1981
df.head()