The magic of CluedIn
https://youtu.be/WrVayWXPUOo?si=XIUybRa5jS6w0Bmk
IPython, the command shell behind Jupyter notebooks, provides an awesome feature called magics. In short, you can skip writing Python code and use something closer to command-line syntax instead.
This approach can simplify many repetitive tasks, including work with the CluedIn Python SDK. In this article, I want to introduce you to CluedIn Magic, a package that lets you work with the CluedIn API with minimal code.
CluedIn Magic depends on the CluedIn Python SDK, so you only need to install one package to get them both:
%pip install cluedin-magic
When working with products like Microsoft Fabric, Synapse Analytics, or Databricks, I usually pre-install packages in the environment, so there is no need to run the line above.
Now, we can load CluedIn Magic by calling the %load_ext magic:
%load_ext cluedin_magic
After this, you can call the %cluedin magic. If you call it without parameters or with wrong parameters, it prints a brief help message:
Available commands: get-context, search
Usage:
%cluedin get-context --jwt <jwt>
%cluedin search --context <context> --query <query> [--limit <limit>]
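To make the usage text above more concrete, here is a minimal sketch of how a line magic could turn the text after its name into a command and options. It uses only the standard library; the function names `build_parser` and `parse_magic_line` are made up for illustration, and this is not how CluedIn Magic itself is implemented.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the two commands from the help text."""
    parser = argparse.ArgumentParser(prog="%cluedin", add_help=False)
    commands = parser.add_subparsers(dest="command", required=True)

    # %cluedin get-context --jwt <jwt>
    get_context = commands.add_parser("get-context", add_help=False)
    get_context.add_argument("--jwt", required=True)

    # %cluedin search --context <context> --query <query> [--limit <limit>]
    search = commands.add_parser("search", add_help=False)
    search.add_argument("--context", required=True)
    search.add_argument("--query", required=True)
    search.add_argument("--limit", type=int, default=None)
    return parser


def parse_magic_line(line: str) -> argparse.Namespace:
    """Split the line the way a shell would, then parse the pieces."""
    return build_parser().parse_args(shlex.split(line))


args = parse_magic_line(
    "search --context ctx --query +entityType:/Infrastructure/User --limit 10"
)
print(args.command, args.context, args.query, args.limit)
# prints: search ctx +entityType:/Infrastructure/User 10
```

This sketch only accepts a single-term query. A query with several space-separated terms, especially ones starting with `-`, would need extra handling that is left out here.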
Get CluedIn context
When you work with the CluedIn API, you need a context: a domain, organization name, email, password, or API token. What if I told you that you only need the API token, and CluedIn Magic will automagically resolve the rest? Let's try it!
First, you need an API token; you can get one from Administration -> API Tokens in CluedIn.
In the example below, I store the token in an environment variable and then read it into a Python variable:
access_token = %env ACCESS_TOKEN
ctx = %cluedin get-context --jwt $access_token
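Outside a notebook, the %env magic is not available, so the same lookup can be done in plain Python. This is a minimal sketch; the helper name `read_token` and its `environ` parameter are made up for illustration, and the demo value is not a real token.

```python
import os


def read_token(name="ACCESS_TOKEN", environ=None):
    """Read a token from the environment and fail early if it is missing."""
    # Allow a custom mapping so the function is easy to try without
    # touching the real process environment.
    env = os.environ if environ is None else environ
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {name} is not set")
    return token


# Demo with a stand-in mapping instead of the real environment.
access_token = read_token(environ={"ACCESS_TOKEN": "demo-token"})
print(access_token)
# prints: demo-token
```

Failing early with a clear message helps here, because an empty token would otherwise only surface later as an authentication error.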
You can also pass the token inline, and CluedIn Magic will give you a working CluedIn context:
ctx = %cluedin get-context --jwt eyJhbGci...5Odvpr1g
You can now use this context with the CluedIn Python SDK or CluedIn Magic.
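Resolving a context from a token alone is plausible because a JWT carries readable claims in its payload. The sketch below shows how such claims can be decoded with the standard library, assuming the API token is a standard three-part JWT. The claim names `OrganizationName` and `iss` and their values are invented for the demo, the signature is not verified, and this is not CluedIn Magic's actual code.

```python
import base64
import json


def _b64url(data: bytes) -> str:
    """Encode bytes as unpadded URL-safe base64, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


# Build a made-up, unsigned token so the example runs on its own.
header = _b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
claims = _b64url(
    json.dumps(
        {"OrganizationName": "foobar", "iss": "https://app.example.com/auth/"}
    ).encode()
)
demo_token = f"{header}.{claims}.signature"

print(decode_jwt_payload(demo_token))
```

Decoding the payload like this is only good for inspecting a token; anything security-sensitive should verify the signature with a proper JWT library.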
Search
Say you want to load all /Infrastructure/User entities: provide a context and a query, and get a pandas DataFrame with your data:
%cluedin search --context ctx --query +entityType:/Infrastructure/User
If you have millions of entities, you can get a sample by providing a limit. In the next example, I get ten entities out of all entities in the system:
%cluedin search --context ctx --query * --limit 10
In the next example, I get ten records of type /IMDb/Name where the imdb.name.birthYear vocabulary key property does not equal \\N:
%cluedin search --context ctx --query +entityType:/IMDb/Name -properties.imdb.name.birthYear:"\\\\N" --limit 10
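The queries above follow a simple pattern: a `+` prefix for terms that must match and a `-` prefix for terms that must not. As a minimal sketch, a small helper can assemble such strings from Python values. The function `build_query` is hypothetical and not part of CluedIn Magic or the CluedIn Python SDK, and it does not quote or escape values, so that is left to the caller.

```python
def build_query(entity_type, must=None, must_not=None):
    """Compose a search string from an entity type and property filters."""
    terms = [f"+entityType:{entity_type}"]
    # Properties that must match get a "+" prefix.
    for key, value in (must or {}).items():
        terms.append(f"+properties.{key}:{value}")
    # Properties that must not match get a "-" prefix.
    for key, value in (must_not or {}).items():
        terms.append(f"-properties.{key}:{value}")
    return " ".join(terms)


print(build_query("/IMDb/Name", must={"imdb.name.birthYear": 1981}))
# prints: +entityType:/IMDb/Name +properties.imdb.name.birthYear:1981
```

Building the string in Python is mostly useful when the entity type or filter values come from variables rather than being typed by hand.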
You can also save the results in a variable and use it as a regular pandas DataFrame:
df = %cluedin search --context ctx --query +entityType:/IMDb/Name +properties.imdb.name.birthYear:1981
df.head()