4. Extract Use Cases - VL examples.ipynb
%reload_ext pidgin
Current
We are working on the November progress, which we plan to discuss next week.
This week we will review the work in VegaLite and the UDF research.
# Extracting Vega Lite Transformations
digraph {
rankdir=LR
subgraph cluster_js{label=JS VegaLite->JupyterComms->ExtractTransform->FrontEnd }
Ibis->DataFrame[label="Empty DataFrame for type information"]
DataFrame->Altair[label="Compose visualization"]
ExtractTransform->UpdatedVegaLite->{FrontEnd UpdatedIbis}
UpdatedIbis->Ibis
Altair->VegaLite
VegaLite
}
## Omnisci UDFs
* https://docs.google.com/document/d/1iqG9gxa-baolsqt5U7T9OLhqFdNj5Ar8PDbFmG6Gscg/edit
* https://github.com/omnisci/mapd-core/pull/290/files
!more "../../mapd-core/python/example1.py"
Extracting Vega Lite Transformations
In this notebook, I will show some use cases for extracting the transformations from a Vega Lite spec and moving their computation from the client to the MapD database.
To demonstrate this functionality, we recreate the examples the Vega Lite team used to demonstrate this feature: https://vega.github.io/vega-lite-transforms2sql/
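To make the idea concrete, here is a minimal sketch of the extraction step on a plain Vega Lite spec dictionary. The helper name extract_aggregates and the naming scheme for the output columns are assumptions made for illustration; this is not the omnisci_renderer implementation. It moves each encoding-level aggregate into an explicit aggregate transform, grouped by the remaining encoded fields, which is the form a database backend could take over.

```python
import copy


def extract_aggregates(spec):
    """Move encoding-level aggregates into an explicit aggregate transform.

    Hypothetical helper for illustration only; not part of omnisci_renderer.
    """
    spec = copy.deepcopy(spec)  # leave the caller's spec untouched
    aggregates = []
    groupby = []
    for channel, enc in spec.get("encoding", {}).items():
        if "aggregate" in enc:
            op = enc.pop("aggregate")
            field = enc.get("field")
            # Assumed naming scheme: "<op>_<field>", or just "<op>" for count().
            name = f"{op}_{field}" if field else op
            agg = {"op": op, "as": name}
            if field:
                agg["field"] = field
            aggregates.append(agg)
            # The channel now reads the precomputed column.
            enc["field"] = name
        elif "field" in enc:
            # Non-aggregated fields become the grouping keys.
            groupby.append(enc["field"])
    if aggregates:
        spec.setdefault("transform", []).append(
            {"aggregate": aggregates, "groupby": groupby}
        )
    return spec


spec = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "carrier_name", "type": "nominal"},
        "y": {"aggregate": "count", "type": "quantitative"},
    },
}
extracted = extract_aggregates(spec)
print(extracted["transform"])
```

Once the aggregate lives in the transform list, the front end only needs the grouped rows, so the raw table never has to leave the database.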
Install
First, we have to install the omnisci_renderer package, as well as altair, ibis and the MapD client. We need to install a development branch of ibis until https://github.com/ibis-project/ibis/pull/1675 is released.
import altair as alt
import ibis
import omnisci_renderer
Carrier names
Let's recreate the first example, counting the carrier names.
First we connect to the table using Ibis:
conn = ibis.mapd.connect(
    host='metis.mapd.com', user='mapd', password='HyperInteractive',
    port=443, database='mapd', protocol='https'
)
t = conn.table("flights_donotmodify")
Then we compose an Altair chart using an ibis expression.
c = alt.Chart(t[t.carrier_name]).mark_bar().encode(
    x='carrier_name',
    y='count()'
)
Finally, we enable a renderer that extracts the aggregate expressions and adds them to the Ibis expression:
alt.data_transformers.enable('ibis')
alt.renderers.enable("extract-ibis")
c
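As a rough illustration of what the database ends up computing for this chart, the sketch below turns an extracted aggregate transform into a SQL string. The real renderer extends the Ibis expression rather than building SQL by hand, so the function aggregate_to_sql, the operator table, and the output column name n_flights are all assumptions made for illustration, and the exact SQL that Ibis generates may differ.

```python
# Assumed mapping from a small subset of Vega Lite aggregate ops to SQL functions.
SQL_OPS = {
    "count": "COUNT",
    "average": "AVG",
    "mean": "AVG",
    "sum": "SUM",
    "min": "MIN",
    "max": "MAX",
}


def aggregate_to_sql(table, transform):
    """Render one Vega Lite aggregate transform as a SQL query string.

    Hypothetical helper; identifiers are not quoted or escaped, so this is
    only suitable for illustration.
    """
    groupby = transform.get("groupby", [])
    columns = list(groupby)
    for agg in transform["aggregate"]:
        func = SQL_OPS[agg["op"]]
        arg = agg.get("field", "*")  # count() has no field, so count rows
        columns.append(f'{func}({arg}) AS {agg["as"]}')
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if groupby:
        sql += f" GROUP BY {', '.join(groupby)}"
    return sql


# Aggregate transform corresponding to the carrier-name bar chart.
carrier_counts = {
    "aggregate": [{"op": "count", "as": "n_flights"}],
    "groupby": ["carrier_name"],
}
query = aggregate_to_sql("flights_donotmodify", carrier_counts)
print(query)
```

The point of the design is that only one row per carrier crosses the wire, instead of every row of the flights table.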
Delay by Month
delay_by_month = alt.Chart(
    t[t.flight_dayofmonth, t.flight_month, t.depdelay]
).mark_rect().encode(
    x='flight_dayofmonth:O',
    y='flight_month:O',
    color='average(depdelay)'
)
delay_by_month
Debugging
We can use display_chart to show some intermediate computations for the chart. It does this by enabling different Altair renderers and displaying the chart with each:
??omnisci_renderer.display_chart
omnisci_renderer.display_chart(c)
omnisci_renderer.display_chart(delay_by_month)