<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[AI in Healthcare]]></title><description><![CDATA[Discover the transformative power of AI in Healthcare. We'll explore the integration of AI, machine learning, and big data analytics, revolutionizing diagnostics, treatment, and patient care.]]></description><link>https://www.talby.com</link><image><url>https://substackcdn.com/image/fetch/$s_!zoEu!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc96cb54c-7e4b-4d25-92e2-21b0bc699690_256x256.png</url><title>AI in Healthcare</title><link>https://www.talby.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 05 May 2026 21:59:32 GMT</lastBuildDate><atom:link href="https://www.talby.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[David Talby]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[aiinhealthcare@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[aiinhealthcare@substack.com]]></itunes:email><itunes:name><![CDATA[David Talby]]></itunes:name></itunes:owner><itunes:author><![CDATA[David Talby]]></itunes:author><googleplay:owner><![CDATA[aiinhealthcare@substack.com]]></googleplay:owner><googleplay:email><![CDATA[aiinhealthcare@substack.com]]></googleplay:email><googleplay:author><![CDATA[David Talby]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Building Responsible Language Models with the NLP Test Library]]></title><description><![CDATA[Automatically generate test cases, run tests, and augment training datasets with the open-source, easy-to-use, cross-library NLP Test 
package]]></description><link>https://www.talby.com/p/building-responsible-language-models</link><guid isPermaLink="false">https://www.talby.com/p/building-responsible-language-models</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Tue, 02 May 2023 15:57:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0oC3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0oC3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0oC3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0oC3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0oC3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0oC3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0oC3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:112619,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0oC3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!0oC3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!0oC3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!0oC3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c3c8a9b-54b5-42b3-83b2-f0d1ea38b061_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The <strong><a href="http://www.nlptest.org/">nlptest library</a></strong> is designed to help you do that by providing comprehensive testing capabilities for both models and data. It allows you to quickly generate, run, and customize tests to ensure your NLP systems are production-ready. 
With support for popular NLP libraries like transformers, Spark NLP, OpenAI, and spaCy, nlptest is an extensible and flexible solution for any NLP project.</p><p>In this article, we&#8217;ll dive into three main tasks that the nlptest library helps you automate: generating tests, running tests, and augmenting data.</p><p></p><h2><strong>Automatically Generate Tests</strong></h2><p>Unlike the testing libraries of the past, nlptest allows for the automatic generation of tests &#8211; to an extent. Each <code>TestFactory</code> can specify multiple test types and implement a test case generator and runner for each one.</p><p>The generated tests are presented as a table with &#8216;test case&#8217; and &#8216;expected result&#8217; columns that correspond to the specific test. These columns are designed to be easily understood by business analysts, who can manually review, modify, add, or remove test cases as needed. For instance, consider the test cases generated by the <code>RobustnessTestFactory</code> for an NER task on the phrase &#8220;I live in Berlin.&#8221;:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jYDi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jYDi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 424w, https://substackcdn.com/image/fetch/$s_!jYDi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 848w, 
https://substackcdn.com/image/fetch/$s_!jYDi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 1272w, https://substackcdn.com/image/fetch/$s_!jYDi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jYDi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png" width="1026" height="278" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:278,&quot;width&quot;:1026,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:46109,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jYDi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 424w, https://substackcdn.com/image/fetch/$s_!jYDi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 848w, 
https://substackcdn.com/image/fetch/$s_!jYDi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 1272w, https://substackcdn.com/image/fetch/$s_!jYDi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3d1d582-8252-411d-891b-a8ff022507ed_1026x278.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Starting from the text &#8220;John Smith is responsible&#8221;, the <code>BiasTestFactory</code> has generated test cases for a text classification 
task using US ethnicity-based name replacement.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!M23t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!M23t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 424w, https://substackcdn.com/image/fetch/$s_!M23t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 848w, https://substackcdn.com/image/fetch/$s_!M23t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 1272w, https://substackcdn.com/image/fetch/$s_!M23t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!M23t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png" width="1136" height="302" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:302,&quot;width&quot;:1136,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59319,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!M23t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 424w, https://substackcdn.com/image/fetch/$s_!M23t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 848w, https://substackcdn.com/image/fetch/$s_!M23t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 1272w, https://substackcdn.com/image/fetch/$s_!M23t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8f02da65-4088-47c1-b54e-1f402ca085fe_1136x302.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Generated by the <code>FairnessTestFactory</code> and <code>RepresentationTestFactory</code> classes, here are test cases that can ensure representation and fairness in the model&#8217;s evaluation. For instance, representation testing might require a test dataset with a minimum of 30 samples of male, female, and unspecified genders each. 
Meanwhile, fairness testing can set a minimum F1 score of 0.85 for the tested model when evaluated on data subsets with individuals from each of these gender categories.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B-_g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B-_g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 424w, https://substackcdn.com/image/fetch/$s_!B-_g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 848w, https://substackcdn.com/image/fetch/$s_!B-_g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 1272w, https://substackcdn.com/image/fetch/$s_!B-_g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B-_g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png" width="1086" height="414" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:414,&quot;width&quot;:1086,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61485,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B-_g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 424w, https://substackcdn.com/image/fetch/$s_!B-_g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 848w, https://substackcdn.com/image/fetch/$s_!B-_g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 1272w, https://substackcdn.com/image/fetch/$s_!B-_g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6abfe816-3558-4aec-b83e-068d71f8b5f2_1086x414.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Keep the following points in mind about test cases:</p><ul><li><p>Each test type has its own interpretation of &#8220;test case&#8221; and &#8220;expected result,&#8221; which should be human-readable. After calling <code>h.generate()</code>, you can manually review the list of generated test cases and decide which ones to keep or modify.</p></li><li><p>Because the test table is a pandas DataFrame, you can edit it within the notebook (with Qgrid) or export it as a CSV file so that business analysts can edit it in Excel.</p></li><li><p>While automation handles 80% of the work, manual checks are necessary. 
For instance, a fake news detector&#8217;s test case may show a mismatch between the expected and actual predictions when a <code>replace_to_lower_income_country</code> test replaces &#8220;Paris is the capital of France&#8221; with &#8220;Paris is the capital of Sudan&#8221;, because the replacement changes the truth of the statement.</p></li><li><p>Tests must align with business requirements, and you must validate that they do. For instance, the <code>FairnessTestFactory</code> does not test non-binary or other gender identities, nor does it mandate nearly equal accuracy across genders. However, the decisions made are clear, human-readable, and easy to modify.</p></li><li><p>Test types may produce only one test case or hundreds of them, depending on the configuration. Each <code>TestFactory</code> defines its own set of parameters.</p></li><li><p>By design, <code>TestFactory</code> classes are usually task-, language-, locale-, and domain-specific, which keeps test factories simpler and more modular.</p></li></ul><h2><strong>Running Tests</strong></h2><p>To use the test cases that have been generated and edited, follow these steps:</p><ul><li><p>Execute <code>h.run()</code> to run all the tests. For each test case in the test harness&#8217;s table, the corresponding <code>TestFactory</code> is called to execute the test and return a flag indicating whether the test passed or failed, along with a descriptive message.</p></li><li><p>After calling <code>h.run()</code>, call <code>h.report()</code>. This method groups the pass ratio by test type, displays a summary table of the results, and returns a flag indicating whether the model passed the entire test suite.</p></li><li><p>To store the test harness, including the test table, as a set of files, call <code>h.save()</code>. 
This lets you load and run the same test suite later, for example when conducting a regression test.</p></li></ul><p>Below is an example of a report generated for a Named Entity Recognition (NER) model, applying tests from five test factories:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HVi4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HVi4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 424w, https://substackcdn.com/image/fetch/$s_!HVi4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 848w, https://substackcdn.com/image/fetch/$s_!HVi4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 1272w, https://substackcdn.com/image/fetch/$s_!HVi4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HVi4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png" width="1110" height="370" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:370,&quot;width&quot;:1110,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:74596,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HVi4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 424w, https://substackcdn.com/image/fetch/$s_!HVi4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 848w, https://substackcdn.com/image/fetch/$s_!HVi4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 1272w, https://substackcdn.com/image/fetch/$s_!HVi4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F94570994-b48e-47f6-b79d-9cf5185e4306_1110x370.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>All the metrics calculated by nlptest, including the F1 score, bias score, and robustness score, are framed as tests with pass or fail outcomes. This approach requires you to specify the functionality of your application clearly, allowing for quicker and more confident model deployment. Furthermore, it enables you to share your test suite with regulators, who can review or replicate your results.</p><p></p><h2><strong>Data Augmentation</strong></h2><p>A common approach to improving your model&#8217;s robustness or reducing its bias is to add new training data that specifically targets these gaps. 
For instance, if the original dataset primarily consists of clean text without typos, slang, or grammatical errors, or doesn&#8217;t represent Muslim or Hindu names, adding such examples to the training dataset will help the model learn to handle them more effectively.</p><p>The same method used to generate tests can also generate training examples automatically to improve the model&#8217;s performance. Here is the workflow for data augmentation:</p><ol><li><p>To automatically generate augmented training data based on the results from your tests, call <code>h.augment()</code> after generating and running the tests. Note that this dataset must be freshly generated: the test suite itself must not be used to retrain the model, because testing a model on data it was trained on would result in data leakage and artificially inflated test scores.</p></li><li><p>You can review and edit the freshly generated augmented dataset as needed, and then use it to retrain or fine-tune your original model. It is available as a pandas DataFrame.</p></li><li><p>To evaluate the newly trained model on the same test suite it failed on before, create a new test harness and call <code>h.load()</code> followed by <code>h.run()</code> and <code>h.report()</code>.</p></li></ol><p>By following this iterative process, NLP data scientists can improve their models while ensuring compliance with their ethical standards, corporate guidelines, and regulatory requirements.</p><p></p><h2><strong>Getting Started</strong></h2><p>Visit <a href="https://nlptest.org/">nlptest.org</a> or run <code>pip install nlptest</code> to get started with the nlptest library, which is freely available. Additionally, nlptest is an early-stage open-source community project that you are welcome to join.</p><p>John Snow Labs has assigned a full development team to the project and will continue to enhance the library for years, as with our other open-source libraries. 
Regular releases with new test types, tasks, languages, and platforms are expected. However, contributing, sharing examples and documentation, or providing feedback will help you get what you need faster. Join the discussion on <a href="https://github.com/johnSnowLabs/nlptest">nlptest&#8217;s GitHub page</a>. Let&#8217;s work together to make safe, reliable, and responsible NLP a reality.</p><p></p>]]></content:encoded></item><item><title><![CDATA[3 Criteria for Regulatory-Grade Large Language Models]]></title><description><![CDATA[Regulatory Grade AI requires transparency, rigorous testing, and privacy protection for AI models in regulated industries. 
It ensures compliance, accuracy, and safety in decision-making.]]></description><link>https://www.talby.com/p/3-criteria-for-regulatory-grade-large</link><guid isPermaLink="false">https://www.talby.com/p/3-criteria-for-regulatory-grade-large</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Thu, 27 Apr 2023 05:33:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/8eea76c4-2de8-49d3-8f8b-2ed9ebb24906_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uhG4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uhG4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uhG4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uhG4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uhG4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!uhG4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3044981,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uhG4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!uhG4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!uhG4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!uhG4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6f4986bd-d3e6-4b39-9f15-afe354d889c4_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Large language models (LLMs) have the potential to revolutionize decision-making and creative processes in many industries. In regulated sectors such as healthcare and life sciences, however, certain issues, gaps, and limitations spur the need for a higher standard of AI, known as Regulatory Grade AI. 
This article defines three criteria that make an AI model "regulatory grade", meaning suitable for use in highly regulated fields while ensuring the highest level of compliance, accuracy, and safety.</p><p></p><h2><strong>The No BS Principle</strong></h2><p>The first criterion is the "No Bullshit" principle. This simply means that LLMs should be designed in a way that prevents them from generating hallucinations or returning false information. Instead, they should be able to cite the source of any answer they provide.</p><p>This feature allows human experts to review the cited source and assess its reliability. For instance, a doctor may receive an answer regarding a clinical guideline from the AI model. If the model cites a study that involved fewer than 100 patients, the doctor can decide not to trust that specific paper, as it may not be sufficiently robust or representative. By providing a transparent trail of evidence, the "No BS" principle ensures that AI-generated information is held to the same standard as any other expert opinion.</p><p></p><h2><strong>Responsible AI</strong></h2><p>The second criterion for regulatory-grade AI is Applied Responsible AI. This means that AI models should undergo rigorous testing for robustness, bias, fairness, toxicity, accuracy, representation, and data leakage.</p><p>These tests should be executable and presented in a human-readable format that can be easily shared with regulators. 
By demonstrating a commitment to responsible AI practices, organizations can reassure regulators, customers, and other stakeholders that their AI models are not only compliant but also adhere to the highest ethical and technical standards.</p><p></p><h2><strong>Privacy: No Sharing</strong></h2><p>The third criterion for regulatory-grade AI is the ability to run privately within an organization's firewall. This ensures that no proprietary or sensitive data is shared or transmitted outside the organization, maintaining security and confidentiality. Systems should be designed from the ground up to work seamlessly in high-compliance, air-gapped environments, protecting organizations from data breaches and other cyber threats.</p><p>By keeping data and processing in-house, organizations can maintain control over their information, which is essential for compliance with stringent regulations in fields such as healthcare and life sciences. Note that this criterion does not preclude running in a cloud environment - as long as you control the infrastructure and encryption keys and no one else ever sees your data.</p><p>Establishing high standards for AI models is crucial in today's rapidly evolving technological landscape. By developing and adopting regulatory-grade AI, organizations can ensure that their AI-driven decision-making processes are <strong>safe and effective</strong>. 
This will ultimately lead to better outcomes for patients, more efficient research, and increased trust in AI-powered solutions across regulated industries.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Medical Large Language Models Are Available Now And More Accurate Than General-Purpose LLM's]]></title><description><![CDATA[Large language models (LLM&#8217;s) unlock new use cases in healthcare NLP. John Snow Labs has launched new Healthcare NLP models for accurate and production-ready healthcare use cases.]]></description><link>https://www.talby.com/p/medical-large-language-models-from</link><guid isPermaLink="false">https://www.talby.com/p/medical-large-language-models-from</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Thu, 13 Apr 2023 05:38:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hFGQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source 
type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hFGQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 424w, https://substackcdn.com/image/fetch/$s_!hFGQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 848w, https://substackcdn.com/image/fetch/$s_!hFGQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 1272w, https://substackcdn.com/image/fetch/$s_!hFGQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hFGQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png" width="595" height="380" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:380,&quot;width&quot;:595,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:28619,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!hFGQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 424w, https://substackcdn.com/image/fetch/$s_!hFGQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 848w, https://substackcdn.com/image/fetch/$s_!hFGQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 1272w, https://substackcdn.com/image/fetch/$s_!hFGQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f79f676-3f13-49cc-8cfa-6a6754222c28_595x380.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Large language models (LLM&#8217;s) unlock new use cases in healthcare NLP, so as part of our commitment to always keep you at the state of the art, the&nbsp;<a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWvF1v8jkpM5W8B86v480wjxBW2sbCLr4ZkgSKN8NyyFp3lScmV1-WJV7CgTj4VpyFdh8ZGHlZN2K_yKzxjsgyW6RVCrk7hlgp9W8X-S5056X04vW5tzHz48QwpKvW19S1br2ZGBfbW3DCdWM6zVmm2W2djXG04cy3N4W57p21P5rGv90W2jt4wS5j0fBLW2YhslW7YhKf1W28F5SP7rhBgCW406bSw2HCZ2cW9lsBC81tLJ7rW96fSQD7f0CYcV-KsDv8crtTMW7bRKy34TFKS1W2RJ9_G8lkyktW4_2x5S7LydXTW1nB9lw1cCQDcVTf3p_5p0x6FN82FPD4kNQWzW20mBYp8bGq6GW8bM1Tm8g7QJNW4lkvkg1K2hHyVDKtLF7db_f332Ks1">latest 4.4 release</a>&nbsp;of John Snow Labs&#8217;&nbsp;<a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWvF1v8jkpM5W8B86v480wjxBW2sbCLr4ZkgSKN8NyyDQ3lSbNV1-WJV7CgVBMW63219B7QD-_VW5FkpTR6Zj3_1W3hlVFf2cdRCtW5jkFLq5k1mpMW7hxYCb92yKDfW4psr2-4SFxwlW7qlnnr2yzN2gW5ByKqG8rD5gfW2Hf6RC4-FndTW2r7VJl6VQ-kbW6YQ28s8nGzGJTLkbJ7mZ0rKW9bMf9j1ptpLzW3N1nfR3jXBybW3nZGyn81s1b7W6zt2bz8Dd4V3W18s_PF27hzP9Vy4Pf34d-bKCW3-Qfkx6B3fjJW6_y4SS36w8PNW5d5M6g4yMqYbN8ypZl8nhcgR3nJH1">Healthcare NLP</a>&nbsp;includes a suite of new LLM&#8217;s that are healthcare specific, highly accurate, and production ready. 
Here&#8217;s what you need to know:</p><p>1.&nbsp;They cover a range of common healthcare use cases</p><ul><li><p><strong>Ask medical questions</strong>: Try asking the new&nbsp;BioGPT-JSL&nbsp;(the&nbsp;first ever closed-book medical question answering LLM based on BioGPT) &#8220;<em>how to treat asthma</em>&#8221;.</p></li><li><p><strong>Understand medical research</strong>: Give the&nbsp;MedicalQuestionAnswering&nbsp;annotator a PubMed abstract and ask it what the key results were.</p></li><li><p><strong>Generate clinical text</strong>: Prompt the&nbsp;MedicalTextGenerator&nbsp;annotator to complete &#8220;<em>66yo male patient presents with severe back pain and &#8230;</em>&#8221;.</p></li><li><p><strong>Summarize clinical encounters</strong>: Ask the&nbsp;MedicalSummarizer&nbsp;annotator to turn a visit summary, discharge note, radiology report, or pathology reports into one paragraph.</p></li><li><p><strong>Summarize questions from patients</strong>: With 5 models for 5 contexts,&nbsp;MedicalSummarizer&nbsp;can also turn an email or post from a patient into a one-sentence question.</p></li></ul><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nONU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nONU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!nONU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nONU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nONU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nONU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:101229,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nONU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!nONU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nONU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nONU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccbf1c5d-b05a-46c2-a871-c6d2db825c5d_1456x1048.jpeg 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" 
x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>2. They&#8217;re more accurate than general-purpose LLM&#8217;s.</p><ul><li><p>Clinical note summarization is&nbsp;<strong>30% more accurate&nbsp;</strong>than general state-of-the-art LLMs (BART, Flan-T5, Pegasus).</p></li><li><p>On clinical entity recognition, our models&nbsp;<strong>make half as many errors as ChatGPT</strong>.</p></li><li><p>De-Identification out-of-the-box<strong>&nbsp;accuracy is&nbsp;93%, compared to ChatGPT&#8217;s 60%,&nbsp;on detecting PHI in clinical notes.</strong></p></li><li><p>Extracting ICD-10-CM codes is done with&nbsp;<strong>a 76% success rate versus 26% for GPT-3.5 and 36% for GPT-4.&nbsp;</strong></p></li></ul><p>It should come as no surprise that models trained with domain-specific data &amp; experts outperform general-purpose models. We&#8217;re happy to share the Python notebooks if you need to reproduce or customize the benchmarks.</p><p>3. They&#8217;re production ready.</p><ul><li><p><strong>Runs on your infrastructure,&nbsp;</strong>behind your firewall, under your security controls. No text is ever sent to any third party or cloud service.</p></li><li><p><strong>No need to buy a shipload of GPU&#8217;s</strong>. We&#8217;ve engineered these LLM&#8217;s to run on commodity hardware, which makes them both much faster and much cheaper to scale.</p></li><li><p><strong>Regularly updated</strong>. LLM&#8217;s are regularly tuned as new research papers, clinical trials, guidelines, and terminologies are published. 
Never go to production with a stale model.</p><p></p></li></ul><p>Most importantly, we&#8217;ll keep rebuilding these models as research evolves, because only one thing is certain about today&#8217;s state-of-the-art LLM&#8217;s: if you train one today, it will be outdated in 3-6 months.</p><p>If you&#8217;re a John Snow Labs customer, all these capabilities are included in your Healthcare NLP subscription. Install the new 4.4 release and give it a go. If you&#8217;d like to learn more, join the next webinar on&nbsp;<a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWvF1v8jkpM5W8B86v480wjxBW2sbCLr4ZkgSKN8NyyFp3lScmV1-WJV7Cg-gfN7P1cTVG-FkHW7HDd6B67j-TYW5HDdFb4bJNn9W29z9sX89S0nzW3yKMs599cv-GW1QSB7S7WKGS8V8cS_r4wgmJ7W1ghqHs7yFcMJW8y0D793z-YJ1W3Y9fH0551qQ4W78Ss982Q8w7JW5F087S2R2QR9VftQBY2CJY-sW8YfCCq1253tdW5MQQj73k44KDW62B2MB15X62MW80KpRb6Q1QLYV6Mp6Y1gfbj6N1Y1Rr2hrR2pW5bMGgk1l8P0fW52C2BN4NG8djW6qbh6n7dtNh_W5p_P-r5m-jpVW5z7qsz4RLT2VW3R9lbL7RWfKxN4vD5NhcRR9s3k_R1">automated summarization of clinical notes</a>&nbsp;on April 26<sup>th</sup>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in 
Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[An early evaluation of ChatGPT on common medical NLP tasks]]></title><description><![CDATA[It is the early days for large language models. John Snow Labs will provide you the benefits of these models as they&#8217;re reliable and ready for prime time in the healthcare & life science industries.]]></description><link>https://www.talby.com/p/an-early-evaluation-of-chatgpt-on</link><guid isPermaLink="false">https://www.talby.com/p/an-early-evaluation-of-chatgpt-on</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Mon, 20 Mar 2023 16:51:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iYsd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iYsd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iYsd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!iYsd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iYsd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iYsd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:76644,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iYsd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iYsd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!iYsd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iYsd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60289ce8-5c40-47cd-b43f-e7094c8190ce_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Motivation</strong></h2><p>John Snow Labs&#8217; main promise to the healthcare industry is that we will keep you 
at the state of the art. We&#8217;ve reimplemented our core algorithms every year since 2017 &#8211; migrating to BERT, then BioBERT, then our own fine-tuned language models, then token classification &amp; sequence classification models, then zero-shot learning, and recently end-to-end visual document understanding and speech recognition. Our biggest customers work with us to be future-proof &#8211; because, unlike others, we do not advocate sticking to a specific technology or approach, but instead evolve quickly to productize the best-performing techniques as they become available.</p><p>As such, we regularly try out and benchmark new papers, models, libraries, or services that come out claiming new capabilities in healthcare NLP. This includes recently released models like <a href="https://openai.com/blog/chatgpt/">ChatGPT</a> and <a href="https://arxiv.org/abs/2210.10341">BioGPT</a>. Since we get asked about them a lot, this blog post summarizes early findings in benchmarking them versus current state-of-the-art models for <a href="https://www.johnsnowlabs.com/healthcare-nlp/">medical natural language processing</a> tasks: named entity recognition, relation extraction, assertion status detection, entity resolution, and de-identification.</p><p><strong>TL;DR: We do not recommend these models for production use today</strong>. They are impressive research advances, and we use them internally to bootstrap smaller and more accurate models, but they are not fit for the vast majority of real-world use cases.</p><p></p><h2><strong>What are the issues?</strong></h2><p>Simply put, ChatGPT doesn&#8217;t do what you need it to:</p><ol><li><p>In our internal evaluations of such models, they significantly <strong>lag in accuracy</strong> compared to current state-of-the-art models. Precision ranged from 0.66 to 0.86, and recall was particularly problematic, at between 0.40 and 0.52. 
This means that the human abstractors you have in place will still have to read each document in full, resulting in minimal time &amp; cost savings for you.</p></li><li><p>There is <strong>no way to tune</strong> or provide feedback to these models. While they are very fast to bootstrap, you cannot tune ChatGPT, meaning that these models won&#8217;t improve over time based on feedback from your abstractors (or the many historical documents you&#8217;ve already abstracted). This is critical in healthcare systems, where models have to be localized due to differences in clinical guidelines, writing styles, and business processes.</p></li><li><p>These models are <strong>far slower and more expensive to run</strong> than their productized &amp; more accurate counterparts. The cost of a single ChatGPT query is <a href="https://www.semianalysis.com/p/the-inference-cost-of-search-disruption">estimated to be $0.36</a>. Given the length of typical patient stories, this implies paying tens of dollars in computing costs alone to analyze a single cancer patient&#8217;s story. Beyond hardware costs, there is also the issue of clock time: For example, reproducing clinical NER benchmarks on Facebook&#8217;s Galactica model required 4 hours to process a single 2-page note on a machine with 8 GPUs.</p></li><li><p>These models are <strong>not regularly updated.</strong> They are typically either not retrained at all (new versions come out instead) or retrained only annually. This means that you&#8217;ll be missing new clinical terms, medications, and guidelines &#8211; with no ability to tune or train these models.</p></li><li><p>There is <strong>no support for visual documents</strong> &#8211; these models cannot operate on scanned documents or images at all. 
This means that a portion of the work often required in real-world use cases &#8211; like <a href="https://www.johnsnowlabs.com/clinical-data-abstraction-from-unstructured-documents-using-nlp/">clinical abstraction</a>, clinical decision support, or real-world data &#8211; will either require a separate OCR pipeline or remain manual (since the models won&#8217;t be able to consider both text &amp; images together when providing answers or recommendations).</p></li><li><p>There is <strong>no pre-processing pipeline</strong>. Clinical text like EHR records includes about 50% copy-and-pasted content, and is organized into sections spread across multiple pages. This has to be normalized first; note that the exact same sentence can mean different things if it&#8217;s under &#8220;chief complaint&#8221;, &#8220;history of present illness&#8221;, or &#8220;plan&#8221;. You&#8217;ll need to build that yourself as well, instead of using a pre-built &amp; widely validated solution. As with scanned documents, this will have to be custom-built outside the large language models.</p></li><li><p>Models like GPT-3 or ChatGPT <strong>require calling a cloud API</strong> &#8211; and sharing your data with the company providing them. Even if the setup becomes HIPAA compliant, and even if you&#8217;re allowed to share the data that way, you&#8217;re providing that company with the intellectual property needed to train &amp; tune better clinical models, instead of building that intellectual property internally by privately tuning your own oncology abstraction models (which is how John Snow Labs&#8217; software &amp; license work).</p></li></ol><p></p><h2><strong>Zero-Shot Learning</strong></h2><p>The good news is that <strong>you can get the benefits of prompt engineering right now</strong> with John Snow Labs, without these downsides. 
There are three first-to-market features that are already available and in use by early adopters to build production-grade, accurate, tunable, scalable, private, cheaper-to-run, up-to-date, healthcare-tuned, and compliant NLP solutions.</p><p>First, <a href="https://nlp.johnsnowlabs.com/docs/en/transformers#zeroshotner">zero-shot named entity recognition</a> and <a href="https://nlp.johnsnowlabs.com/docs/en/licensed_annotators#zeroshotrelationextractionmodel">zero-shot relation extraction</a> enable you to extract custom entities and relationships from medical text without any training, tuning, or data labeling. This is useful when your goal is to optimize go-to-market time over accuracy &#8211; i.e. can I get a model that&#8217;s 80% accurate today, instead of a model that&#8217;s 95% accurate in 3 months? For example, if you&#8217;re automating the process of creating a cancer registry, then you may wish to invest to optimize the models for fields relating to tumor staging &amp; histology, but go with zero-shot models for the 400+ rarely filled data fields.</p><p>Models based on Longformer, Albert, Bert, CamemBert, DeBerta, DistilBert, Roberta, and XlmRoberta have been implemented already. John Snow Labs has already gone a step beyond enabling prompt engineering, with automated prompt generation based on the T5 transformer. 
This functionality is currently not available in Hugging Face, which only supports zero-shot text classification.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TyA8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TyA8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 424w, https://substackcdn.com/image/fetch/$s_!TyA8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 848w, https://substackcdn.com/image/fetch/$s_!TyA8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 1272w, https://substackcdn.com/image/fetch/$s_!TyA8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TyA8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png" width="815" height="535" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:535,&quot;width&quot;:815,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Examples of prompts for zero-shot named entity recognition (top) and relation extraction (bottom)&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Examples of prompts for zero-shot named entity recognition (top) and relation extraction (bottom)" title="Examples of prompts for zero-shot named entity recognition (top) and relation extraction (bottom)" srcset="https://substackcdn.com/image/fetch/$s_!TyA8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 424w, https://substackcdn.com/image/fetch/$s_!TyA8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 848w, https://substackcdn.com/image/fetch/$s_!TyA8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 1272w, https://substackcdn.com/image/fetch/$s_!TyA8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F632ca983-c95e-477a-a1e0-a2cf1426849b_815x535.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2><strong>No-Code Prompt Engineering</strong></h2><p>A second use for zero-shot prompt engineering is to bootstrap higher-accuracy models. Instead of labeling data from scratch, you can pre-annotate data with prompts, after which your domain experts only need to correct what it got wrong. 
This is similar to what we&#8217;ve done with programmatic labeling &#8211; we&#8217;ve added it to the <a href="https://www.johnsnowlabs.com/nlp-lab/">NLP Lab</a> as another way to bootstrap models, but not as a full replacement.</p><p><a href="https://www.johnsnowlabs.com/watch-combining-prompt-engineering-programmatic-labelling-and-model-tuning-in-the-no-code-nlp-lab/">The NLP Lab lets you seamlessly combine models (transfer learning), rules (programmatic labeling), and prompts (zero-shot learning) to bootstrap NLP model development</a>. This typically makes labeling projects 80%-90% faster than &#8220;from scratch&#8221; projects. Importantly, the ability to combine models, rules, and prompts for different tasks gives you the best of all worlds, in contrast to systems that focus on AI-assisted labeling with just one of these: models (e.g. Labelbox), rules (e.g. Snorkel), or prompts (e.g. ChatGPT).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Fqz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Fqz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 424w, https://substackcdn.com/image/fetch/$s_!4Fqz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 848w, https://substackcdn.com/image/fetch/$s_!4Fqz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 
1272w, https://substackcdn.com/image/fetch/$s_!4Fqz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Fqz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png" width="1426" height="757" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:757,&quot;width&quot;:1426,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!4Fqz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 424w, https://substackcdn.com/image/fetch/$s_!4Fqz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 848w, https://substackcdn.com/image/fetch/$s_!4Fqz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4Fqz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf6f251-586f-4c74-bf04-2163bfad73c4_1426x757.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The NLP Lab is the first to market with a user interface intended for non-technical domain experts (e.g. medical doctors use it to train &amp; tune models) that allows you to start with a prompt (or a pre-trained model, or a rule), see how well it performs on real data, provide feedback as needed, and publish that model. 
That is how you can quickly scale this effort to support a broad range of document types, medical contexts, and entities &amp; relationships to extract. We see customers already doing the same, and we&#8217;ve been doing this internally for a while now. There&#8217;s no point in reinventing the wheel by starting from a cloud API and rebuilding this entire workflow and user experience.</p><p></p><h2><strong>Zero-Shot Visual Question Answering</strong></h2><p>Zero-shot visual question answering is also already available. This model is based on the architecture of <a href="https://arxiv.org/abs/2111.15664">Donut: an OCR-free Document Understanding Transformer</a> that can answer questions (i.e. do fact extraction) directly from an image or visual document. This does not require any training or tuning. As of October 2022, this architecture delivers state-of-the-art accuracy on a variety of visual document understanding benchmarks covering receipts, invoices, tickets, letters, memos, emails, and business cards &#8211; in English, Chinese, Japanese, and Korean.</p><p>The supported visual NLP tasks are document classification, information extraction, and question answering. 
For example, the model can be given this image:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IDwh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IDwh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 424w, https://substackcdn.com/image/fetch/$s_!IDwh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 848w, https://substackcdn.com/image/fetch/$s_!IDwh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 1272w, https://substackcdn.com/image/fetch/$s_!IDwh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IDwh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png" width="637" height="640" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39d9720e-6124-4e6a-80be-084805da47a2_637x640.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:640,&quot;width&quot;:637,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!IDwh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 424w, https://substackcdn.com/image/fetch/$s_!IDwh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 848w, https://substackcdn.com/image/fetch/$s_!IDwh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 1272w, https://substackcdn.com/image/fetch/$s_!IDwh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39d9720e-6124-4e6a-80be-084805da47a2_637x640.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>It can then be asked to answer these two questions:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e687!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e687!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 424w, https://substackcdn.com/image/fetch/$s_!e687!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 848w, 
https://substackcdn.com/image/fetch/$s_!e687!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 1272w, https://substackcdn.com/image/fetch/$s_!e687!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e687!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png" width="1456" height="78" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:78,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:31876,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e687!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 424w, https://substackcdn.com/image/fetch/$s_!e687!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 848w, 
https://substackcdn.com/image/fetch/$s_!e687!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 1272w, https://substackcdn.com/image/fetch/$s_!e687!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F130c6976-3bf0-4660-9bf8-80766dd3d857_2118x114.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Without any training or tuning, it will provide these two answers:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VAee!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VAee!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 424w, https://substackcdn.com/image/fetch/$s_!VAee!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 848w, https://substackcdn.com/image/fetch/$s_!VAee!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 1272w, https://substackcdn.com/image/fetch/$s_!VAee!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!VAee!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png" width="1456" height="75" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:75,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:25989,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VAee!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 424w, https://substackcdn.com/image/fetch/$s_!VAee!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 848w, https://substackcdn.com/image/fetch/$s_!VAee!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 1272w, https://substackcdn.com/image/fetch/$s_!VAee!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5661259-6e76-408d-bccf-acbd30897d0d_2104x108.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Note that these questions require visual understanding in addition to reading the text: The model should 
implicitly deduce that this image is an agenda of an event, that the times in the left column most likely state when each session happens, and that if what looks like a person&#8217;s name appears next to a topic, that person is most likely the speaker for that session. This &#8220;common sense&#8221; knowledge is available out of the box &#8211; in a production-grade, scalable, and private library.</p><p></p><h2><strong>What Next?</strong></h2><p>It is still early days for large language models. We expect high-speed innovation to continue &#8211; with new entrants building on GPT-3, DALL-E, and ChatGPT flooding the commercial &amp; open-source arenas. John Snow Labs will provide you the benefits of these models as soon as they&#8217;re reliable and ready for prime time in the healthcare &amp; life science industries. We highly recommend that you start using what&#8217;s currently available &#8211; and welcome feedback and requests. Prompt engineering, No-code, and Responsible AI are the three major NLP trends we&#8217;re focused on in 2023, and you&#8217;ll see much more in all three areas in the software we&#8217;re building for you this year.</p><p></p>]]></content:encoded></item><item><title><![CDATA[3 Pragmatic Differences Between Academic And Production Software Libraries]]></title><description><![CDATA[Choosing the right AI library is crucial for success. 
Academic libraries prioritize reproducibility, while industry-focused libraries prioritize production-grade code.]]></description><link>https://www.talby.com/p/3-pragmatic-differences-between-academic</link><guid isPermaLink="false">https://www.talby.com/p/3-pragmatic-differences-between-academic</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 15 Mar 2023 06:29:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/727a7332-c7e5-434a-b465-48265fe5623e_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!V1mp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!V1mp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 424w, https://substackcdn.com/image/fetch/$s_!V1mp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 848w, https://substackcdn.com/image/fetch/$s_!V1mp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!V1mp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!V1mp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg" width="1090" height="562" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:562,&quot;width&quot;:1090,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180712,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!V1mp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 424w, https://substackcdn.com/image/fetch/$s_!V1mp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 848w, https://substackcdn.com/image/fetch/$s_!V1mp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!V1mp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46e038eb-428f-4774-b772-e5cad096d58a_1090x562.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Image credit: https://ssl.engineering.nyu.edu/blog/2019-09-03-bridging-pt2</figcaption></figure></div><p></p><p>Starting a new AI project will often confront you with <a href="https://en.wikipedia.org/wiki/The_Paradox_of_Choice">the paradox of choice</a>:&nbsp; there are too many great libraries and models to start from. One way to avoid analysis paralysis due to feature-by-feature comparisons is to focus on tools that were designed for your kind of project.</p><p>At my company, John Snow Labs, we often get compared to Allen NLP, Stanza, SciSpacy, and other libraries focused on academic use cases. 
Many AI libraries started in academia to help researchers write papers faster, while others were created specifically to help enterprises build production systems.</p><p>These are very different communities, which results in different design decisions and priorities. Some academic libraries may indeed go on to become mainstream, but there are fundamental differences that should be considered depending on your goals.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.talby.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h2><strong>Reproducibility Vs. Freshness</strong></h2><p>Models perform differently on academic datasets versus real-world data. In industry, you need current, state-of-the-art models to succeed, and these models have to be regularly updated.</p><p>Take BioBERT, a pre-trained biomedical language representation model for biomedical text mining. This is an adaptation of BERT (Bidirectional Encoder Representations from Transformers), a neural network-based technique for natural language processing (NLP) pre-training, specifically for biomedical use cases. In production, you want a model like BioBERT to be retrained regularly on the latest research, covering biomedical language and not only general English.</p><p>BioBERT was&nbsp;<a href="https://academic.oup.com/bioinformatics/article/36/4/1234/5566506">trained in early 2019</a>, and as we know, a lot has happened in healthcare and society since then. BioBERT considers &#8220;Covid-19&#8221; to be an unrecognized, out-of-vocabulary keyword. 
This isn&#8217;t a problem if you&#8217;re only using BioBERT to reproduce old papers and results; in fact, having a frozen model is a requirement for such reproducibility. But imagine using such a model in a production system.</p><p>Medical terminologies and practices keep evolving: If you have a model that identifies drug names in a medical text, you need it updated nearly weekly in order to track new drugs that come to market. The same goes for diseases, procedures, medical devices, biomarkers, antibodies, surgical techniques, and other terms.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!r14P!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!r14P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!r14P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!r14P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!r14P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!r14P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:101032,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!r14P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!r14P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!r14P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!r14P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb423788-0e96-41fe-80a2-09ede652b423_1456x1048.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><h2><strong>Production-Grade Codebase Vs. Fast Prototyping</strong></h2><p>Production-grade software implies code that has strong test coverage, automated CI/CD infrastructure, regular tests for security vulnerabilities, a release process that ensures that faulty or malicious software can&#8217;t be slipped into the codebase, and a focus on optimizing speed, memory consumption, and compatibility with major cloud providers and compute platforms. 
In contrast, research frameworks are focused on speed of prototyping, which leads to very different software designs and processes.</p><p>For example, in October 2021, researchers from&nbsp;<a href="https://www.marktechpost.com/2021/10/30/google-research-introduces-scenic-an-open-source-jax-library-for-computer-vision-research/">Google introduced SCENIC</a>, an open-source JAX library with a focus on Transformer-based models for computer vision research. Its aim is to make large-scale model prototyping faster, and thus make it easier for people to make small changes and write papers.</p><p>Historically, research libraries like SCENIC have been very successful at prioritizing rapid prototyping. This enables the creation of product simulations for testing and validation during the product development process.</p><p>Here is&nbsp;<a href="https://github.com/google-research/scenic#philosophy:~:text=baseline%20models.-,Philosophy,-Scenic%20aims%20to">the "Philosophy" section</a>&nbsp;from the project&#8217;s GitHub homepage: &#8220;<em>Scenic</em>&nbsp;aims to facilitate rapid prototyping of large-scale vision models. To keep the code simple to understand and extend, we prefer&nbsp;<em>forking and copy-pasting over adding complexity or increasing abstraction</em>. Only when functionality proves to be widely useful across many models and tasks, it may be upstreamed to Scenic's shared libraries.&#8221;</p><p>SCENIC has been successful precisely because it has made explicit trade-offs to achieve its goal: helping researchers move faster instead of building a reusable and well-abstracted codebase. 
It&#8217;s another example where an academic-focused library is not fit for production systems, not because it&#8217;s poorly designed but because it is well designed and managed to achieve a different goal.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.linkedin.com/groups/7010492/&quot;,&quot;text&quot;:&quot;Join The LinkedIn Group&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.linkedin.com/groups/7010492/"><span>Join The LinkedIn Group</span></a></p><p></p><h2><strong>Roadmap Prioritization</strong></h2><p>A third major difference between academic- and industry-focused libraries is what they prioritize. For example, in an academic setting, when you train a new model, you&#8217;ll want to run it against standard academic benchmarks. Being able to run your model against the entire&nbsp;<a href="https://super.gluebenchmark.com/">SuperGLUE</a>&nbsp;benchmark for natural language understanding in one line of code, and to easily reproduce results from other models on different metrics, is an amazing feature. Having additional helper scripts that organize the output and provide detailed comparisons to other models is also very useful.</p><p>In contrast, enterprise customers don&#8217;t care about this. They care about reliability, scalability, cost, security, and compliance. What type of data will you have to share to get your AI project off the ground, and what protective measures are in place? Do they meet regulations such as GDPR, CCPA, or industry-specific laws like HIPAA? How will you factor in explainability and avoid bias or concept drift over time? How will monitoring take place? What are the versioning and release processes, and do they integrate with enterprise-wide tools? 
How would training, tuning, and inference of the model integrate into the overall enterprise architecture?</p><p></p><h2><strong>Two Communities, Two Needs</strong></h2><p>Ultimately, there are major technical gaps between building a model and getting it ready for use in real-world products and services. Closing these gaps is largely a software engineering effort, not a data science effort, and the right skill sets must be involved.</p><p>In practice, there are two different communities that need to be served: those in academia and those in industry. For enterprise AI users especially, it would seem that picking the right library before the right tool is the best way to ensure your AI projects have the greatest chance of success.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Applying Responsible NLP in Real-World Projects]]></title><description><![CDATA[The underlying principles behind the NLP Test library: Enabling data scientists to deliver reliable, safe and effective language models.]]></description><link>https://www.talby.com/p/applying-responsible-nlp-in-real</link><guid isPermaLink="false">https://www.talby.com/p/applying-responsible-nlp-in-real</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Mon, 20 Feb 2023 16:46:00 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!gH8H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gH8H!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gH8H!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gH8H!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gH8H!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gH8H!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gH8H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg" width="1456" height="1048" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3038318,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gH8H!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!gH8H!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!gH8H!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!gH8H!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c5b145-6515-41ed-80f8-cd53a9ca1a94_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2><strong>Responsible AI: Getting from Goals to Daily Practices</strong></h2><p>How is it possible to develop AI models that are transparent, safe, and equitable? As AI impacts more aspects of our daily lives, concerns about discrimination, privacy, and bias are on the rise. The good news is that there is a growing movement towards Responsible AI with the goal of ensuring that models are designed and deployed in ways that align with ethical principles, which include [<a href="https://www.nist.gov/trustworthy-and-responsible-ai">NIST 2023</a>]:</p><p><strong>Validity and Reliability: </strong>Developers should take steps to ensure that models perform as they should under a variety of circumstances.</p><p><strong>Security and Resiliency:</strong> Models should show robustness to data and context that is different than what they were trained on or different than what is normally expected. 
Models should not violate system or personal security or enable security violations.</p><p><strong>Explainability and Interpretability: </strong>Models should be capable of answering stakeholder questions about the decision-making processes of <em>AI systems.</em></p><p><strong>Fairness with Mitigation of Harmful Bias: </strong>Models should be designed to avoid bias and ensure equitable treatment of all individuals and groups impacted by the data. This includes ensuring the proper representation of protected groups in the dataset.</p><p><strong>Privacy:</strong> Data privacy and security should be prioritized in all stages of the AI pipeline. This includes both respecting the rights of people who do not wish to be included in training data and preventing leakage of private data by a model.</p><p><strong>Safety:</strong> Models should be designed to avoid harm to users and mitigate potential risks. This includes model behavior in unexpected conditions or edge cases.</p><p><strong>Accountability:</strong> Developers should be accountable for the impact of their models on society and should take steps to address any negative consequences.</p><p><strong>Transparency:</strong> Developers should be transparent about the data sources, model design, and potential limitations or biases of the models.</p><p>However, today there is a gap between these principles and current state-of-the-art NLP models.</p><p>According to [<a href="https://arxiv.org/abs/2005.04118">Ribeiro 2020</a>], the sentiment analysis services of the top three cloud providers fail 9-16% of the time when replacing neutral words, and 7-20% of the time when changing neutral named entities. These systems also failed 36-42% of the time on temporal tests and almost 100% of the time on some negation tests. 
Personal information leakage has been shown to be as high as 50-70% in popular word and sentence embeddings, according to [<a href="https://arxiv.org/abs/2004.00053">Song &amp; Raghunathan 2020</a>]. In addition, state-of-the-art question-answering models have been shown to exhibit biases around race, gender, physical appearance, disability, and religion [<a href="https://arxiv.org/abs/2110.08193">Parrish et al. 2021</a>] &#8211; sometimes changing the likely answer more than 80% of the time. Finally, [<a href="https://arxiv.org/abs/2111.15512">van Aken et al. 2022</a>] showed that adding any mention of ethnicity to a patient note reduces the patient&#8217;s predicted risk of mortality &#8211; with the most accurate model producing the largest error.</p><p>These findings suggest that current NLP systems are unreliable and flawed. We would not accept a calculator that works correctly only some of the time, or a microwave that randomly alters its strength based on the type of food or time of day. Therefore, a well-engineered production system should work reliably on standard inputs and be safe &amp; robust when handling uncommon ones. Three fundamental software engineering principles can help us get there.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.talby.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h2><strong>Software Engineering Fundamentals</strong></h2><p><strong>Testing software</strong> is crucial to ensure it works as intended. The reason why NLP models often fail is straightforward: they are not tested enough. While recent research papers have shed light on this issue, testing should be standard practice before deploying any software to production. 
Furthermore, testing should be carried out every time the software is changed, as NLP models can also regress over time [<a href="https://arxiv.org/abs/2105.03048">Xie et al. 2021</a>].</p><p>Even though most academics make their models publicly available and easily reusable, it is <strong>not recommended to reuse academic models as production-ready ones</strong>. This is because tools that are designed to reproduce research results may not be suitable for production use. Reproducibility makes research faster and enables benchmarks like <a href="https://super.gluebenchmark.com/">SuperGLUE</a>, <a href="https://github.com/EleutherAI/lm-evaluation-harness">LM-Harness</a>, and <a href="https://github.com/google/BIG-bench">BIG-bench</a>, but it requires that models remain the same rather than being continuously updated and improved. For example, <a href="https://arxiv.org/abs/1901.08746">BioBERT</a>, a commonly used biomedical embedding model, was published in early 2019 and, because of its release date, does not treat COVID-19 as a vocabulary word. This illustrates how relying solely on academic models may hinder the effectiveness of NLP systems in production environments.</p><p>It is important to <strong>test beyond accuracy</strong> in your NLP system. This is because the business requirements for the system include robustness, reliability, fairness, low toxicity, efficiency, lack of bias, lack of data leakage, and safety. Therefore, your test suites should reflect these requirements. A comprehensive review of definitions and metrics for these terms in different contexts is provided in the <a href="https://arxiv.org/abs/2211.09110">Holistic Evaluation of Language Models</a> [Liang et al. 2022], which is well worth reading. 
However, you will need to write your own tests to determine what inclusiveness means for your specific application.</p><p>Your tests should be specific, isolated, and easy to maintain, as well as versioned and executable so that they can be incorporated into an automated build or MLOps workflow. To simplify this process, you can use the nlptest library, which is a straightforward framework.</p><p></p><h2><strong>Design Principles of the NLP Test Library</strong></h2><p>Designed around five principles, the <a href="https://nlptest.org/">nlptest library</a> is intended to make it easier for data scientists to deliver reliable, safe, and effective language models.</p><p><strong>Open Source</strong>. It is an open-source community project under the Apache 2.0 license, free to use forever for commercial and non-commercial purposes with no caveats. It has an active development team that welcomes contributions and code forks.</p><p><strong>Lightweight</strong>. The library is lightweight and can run offline (i.e., in a VPN or a high-compliance enterprise environment) on a laptop, eliminating the need for a high-memory server, cluster, or GPU. 
Installation is as simple as running pip install nlptest, and generating and running tests can be done in just three lines of code.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P2ce!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P2ce!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 424w, https://substackcdn.com/image/fetch/$s_!P2ce!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 848w, https://substackcdn.com/image/fetch/$s_!P2ce!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 1272w, https://substackcdn.com/image/fetch/$s_!P2ce!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!P2ce!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png" width="728" height="70.5" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:141,&quot;width&quot;:1456,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:44786,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P2ce!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 424w, https://substackcdn.com/image/fetch/$s_!P2ce!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 848w, https://substackcdn.com/image/fetch/$s_!P2ce!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 1272w, https://substackcdn.com/image/fetch/$s_!P2ce!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38d785b4-eecc-4347-a461-d9556de7061c_2154x208.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>After you import the library, create a new test harness for the specified Named Entity Recognition (NER) model from John Snow Labs&#8217; NLP models hub, and run the code, the library automatically generates test cases (based on the default configuration) and produces a report.</p><p>Storing 
tests in a pandas DataFrame makes it simple to edit, filter, import, or export them. The entire test harness can be saved and loaded, allowing you to run a regression test of a previously configured test suite simply by calling h.load(&quot;filename&quot;).run().</p><p><strong>Cross Library</strong>. The framework provides out-of-the-box support for <a href="https://huggingface.co/docs/transformers/index">transformers</a>, <a href="https://nlp.johnsnowlabs.com/">Spark NLP</a>, and <a href="https://spacy.io/">spaCy</a>, and can be easily extended to support additional libraries. It can test both pre-trained and custom NLP pipelines from any of these libraries. As an AI community, we have no need to build test generation and execution engines more than once.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LxzA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LxzA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 424w, https://substackcdn.com/image/fetch/$s_!LxzA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 848w, https://substackcdn.com/image/fetch/$s_!LxzA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LxzA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LxzA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png" width="1456" height="218" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:218,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:114302,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LxzA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 424w, https://substackcdn.com/image/fetch/$s_!LxzA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 848w, https://substackcdn.com/image/fetch/$s_!LxzA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LxzA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21d1f4c1-4c31-4b86-bbc5-3ae754a4ea45_2096x314.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>Extensible</strong>. Since there are hundreds of potential types of tests and metrics to support, additional NLP tasks of interest, and custom needs for many projects, the framework has been designed to be extensible, making it easy to implement and reuse new types of tests. For instance, the framework includes a built-in test type for bias in US English, which replaces first and last names with names that are common for White, Black, Asian, or Hispanic people. But what if your application is intended for India or Brazil, or if the testing needs to consider bias based on age or disability, or if a different metric is needed to decide when a test should pass?</p><p>The nlptest library makes it easy to write and then mix and match test types. The TestFactory class defines a standard API for different tests to be configured, generated, and executed. We&#8217;ve put a lot of effort into making the library easy to tailor to your needs and easy to contribute to.</p><p><strong>Test Models and Data</strong>. When a model is not ready for production, the cause often lies in the dataset used for training or evaluation rather than in the modeling architecture. A widely prevalent issue in commonly used datasets, as demonstrated by [<a href="https://arxiv.org/abs/2103.14749">Northcutt et al. 2021</a>], is the mislabeling of training examples. 
Additionally, representation bias presents a challenge for assessing a model&#8217;s performance across ethnic lines, as there may not be enough test labels to calculate a usable metric. In such cases, it is appropriate for the library to fail a test and suggest changes to the training and test sets to better represent other groups, fix likely mistakes, or train for edge cases.</p><p>Therefore, a test scenario is defined by a task, a model, and a dataset, i.e.:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Ec-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Ec-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 424w, https://substackcdn.com/image/fetch/$s_!0Ec-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 848w, https://substackcdn.com/image/fetch/$s_!0Ec-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ec-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0Ec-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png" width="1456" height="164" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:164,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54622,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Ec-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 424w, https://substackcdn.com/image/fetch/$s_!0Ec-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 848w, https://substackcdn.com/image/fetch/$s_!0Ec-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ec-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdb87695-bf17-4dd7-8737-eeb081093752_2108x238.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This setup not only allows the library to offer a complete testing strategy for both models and data but 
also enables you to use generated tests to augment your training and test datasets, which can considerably reduce the time required to fix models and prepare them for production.</p><p>The next sections describe how the nlptest library helps you automate three tasks: generating tests, running tests, and augmenting data.</p><p></p><h2><strong>Getting Started</strong></h2><p>Ready to improve the safety, reliability, and accuracy of your NLP models? Get started with John Snow Labs&#8217; <strong>nlptest</strong> library by visiting <a href="https://nlptest.org/">nlptest.org</a> and installing it with pip install nlptest. With its support for multiple NLP libraries, extensible framework for creating custom tests, and ability to generate and run tests on both models and datasets, you can quickly identify issues and improve the accuracy of your models.</p><p>Join our open-source community project on <a href="https://github.com/johnSnowLabs/nlptest">GitHub</a>, share examples and documentation, and contribute to the development of the library.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Free No-Code NLP: The Annotation Lab Becomes the NLP Lab]]></title><description><![CDATA[The Annotation Lab has evolved into the NLP Lab, a free no-code NLP platform with enterprise-grade features. It supports the full lifecycle of NLP model development and includes a Private Models Hub.]]></description><link>https://www.talby.com/p/free-no-code-nlp-the-annotation-lab</link><guid isPermaLink="false">https://www.talby.com/p/free-no-code-nlp-the-annotation-lab</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Tue, 10 Jan 2023 06:41:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zLAG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zLAG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!zLAG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zLAG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zLAG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zLAG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3071968,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zLAG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!zLAG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!zLAG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!zLAG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8804d3f3-5d76-4243-b7a2-0eb88eeac0e7_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bs64!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bs64!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 424w, https://substackcdn.com/image/fetch/$s_!bs64!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 848w, https://substackcdn.com/image/fetch/$s_!bs64!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 1272w, https://substackcdn.com/image/fetch/$s_!bs64!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bs64!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9014359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bs64!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 424w, https://substackcdn.com/image/fetch/$s_!bs64!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 848w, https://substackcdn.com/image/fetch/$s_!bs64!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 1272w, https://substackcdn.com/image/fetch/$s_!bs64!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F798cc7d8-23b2-4706-9003-423c582b30c4_1920x1011.gif 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K5hG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!K5hG!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 424w, https://substackcdn.com/image/fetch/$s_!K5hG!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 848w, https://substackcdn.com/image/fetch/$s_!K5hG!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 1272w, https://substackcdn.com/image/fetch/$s_!K5hG!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K5hG!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9014359,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K5hG!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 424w, 
https://substackcdn.com/image/fetch/$s_!K5hG!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 848w, https://substackcdn.com/image/fetch/$s_!K5hG!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 1272w, https://substackcdn.com/image/fetch/$s_!K5hG!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F01addc13-90e0-44da-a50f-3715591887c0_1920x1011.gif 1456w" sizes="100vw"></picture><div></div></div></a></figure></div><p><a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VX5WJ68yrylNN5wdjRnF-N_VVDyhz94VF3HhN4jRWbX3lSbtV1-WJV7CgGVqW6HTl-17QfpPMVbj-hs3L48_nW8jZKKB2nmXz-MfXntwW-1Q1W4j60Rz8fzXx4W1Cnq6n6cfDYTW6gK5xT64TMG-W3KMTFP6YSw39W46wYDT60Rn7SW7zybJz1yzxN2W3J0Mc7616pFcW2bsh9t7VrCDRW8FP-kW52ShL9W6whqBT5NxgLfW5xHTPy7-2b4-W8qM49_40ntPbW50w10c292RpjW7lFrPd7dj9f-W51q8Rz9gP6v8W2ZwT_X4KV8743czk1">The Annotation Lab</a> was adopted by more than 100 organizations worldwide last year. 
The AI community took to the idea of a 100% free data labeling platform with all &#8220;enterprise&#8221; features, including:</p><ul><li><p>AI-Assisted Labeling</p></li><li><p>Team &amp; Project Management</p></li><li><p>Enterprise-Grade Security</p></li><li><p>Custom Workflows</p></li><li><p>Analytics</p></li><li><p>Scalability</p></li><li><p>Privacy</p></li><li><p>Versioned Audit Trails</p></li></ul><p>All of this comes with no limits and with support for both cloud and on-premise deployment.</p><p>After two years of releasing new versions every two weeks, the lab has now evolved to become <a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VX5WJ68yrylNN5wdjRnF-N_VVDyhz94VF3HhN4jRWbX3lSbtV1-WJV7CgHWSN8ZqBTfZdjBQW8nhbVg5sMgQKW2KCx5j1ZwppjW95j2Wz3P6_3JV83_Dr15xm5lW23sWw07vh-xNW38gwmJ569ffsW5XR7sg8mJ3gSW8K2bs-82fDbMW4ZSPb37dPWV4TPVC-7hYCCBW49Ndv11DF0rWW70tX0w52DL-WN84CGdXFP6MkW62N1B7364mqXW4xkFsb6WzJKhMZ7v5CfFq4sN4cJFxcjXY1SW8fJ-JM9c-6y_V3Z9JD4HtwpB39Mv1">the NLP Lab: a free no-code NLP platform</a>. Doctors, pharmacists, lawyers, and financial analysts use it to train, tune, test, and publish NLP models, often without getting data scientists involved. 
The platform now supports the full lifecycle: defining a new project and its goals, starting from a pre-trained model, teaching it domain expertise (by combining labels, rules, and models), training or tuning a model, testing it, and publishing it as an API.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zrXg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zrXg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 424w, https://substackcdn.com/image/fetch/$s_!zrXg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 848w, https://substackcdn.com/image/fetch/$s_!zrXg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 1272w, https://substackcdn.com/image/fetch/$s_!zrXg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zrXg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png" width="1456" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1337168,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zrXg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 424w, https://substackcdn.com/image/fetch/$s_!zrXg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 848w, https://substackcdn.com/image/fetch/$s_!zrXg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 1272w, https://substackcdn.com/image/fetch/$s_!zrXg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6839ee9d-1e7b-48e4-9aba-f43e39f68a0f_3024x1596.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>The lab includes a <a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VX5WJ68yrylNN5wdjRnF-N_VVDyhz94VF3HhN4jRWcc3lSbNV1-WJV7CgX1LW2tzbvT1R8Kb4W5S1Wcm7JKm9NW2z70pZ8_8VRtW7ysCnh7BF1NWVJ4zhw5c1xMtW1yxV4D42ZL2cW5S7WX65RgSWMW6sQYk66qg4x4VDLVw83n8jM6W4XgY3s3SzFfzW4QGtpG2B686jVNPTDW4yJbRcW2p-y-q4t-BGJVYqBgj8jqMHZW15X3RG2MbWJLN1BmmsDXM70qMmP81xv26CMW770PsB9hF7YpW1ySZWT2pYBQYW4YVRPV6s7rMxW4L79Qj6dls-yN915V5nVfSgh3h6x1">Private Models Hub</a>, letting you search, filter, manage, and safely share custom models you&#8217;ve built. The hub&#8217;s new Playground lets you quickly test any model or rule on a snippet of text, without the need to create a project and import tasks.</p><p>The next major feature is prompt engineering, which democratizes zero-shot learning by putting it in the hands of business domain experts. 
To learn more about this exciting new capability, watch the webinar on <a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VX5WJ68yrylNN5wdjRnF-N_VVDyhz94VF3HhN4jRWbk5knJ3V3Zsc37CgVHXW3c9TbD8nWJV_W7p52Pl94l_wTW12_-9D2PLFCbW6z3Rsx3xrykGW7_NW3n6076cPW5Jxp0-6wvNPXW8kKgHl96wm18W27k7S23MShjyW1vXfq510QyZWW7d67rj52LlssW61WNv_5kkhLjW7HTztC5x9QB5W4qnStB4bzrjmW94-B6Z7bPbnnW1ly_Mc3Hvks8W6KHy8Y9h3MMfW5HTljB130hBdW31Gczz1sy5zGVlY6JC7Y0jm4VBVrk29fgtGtVNGrgN1j-HmVW40xj_-324L5zW7zZXBt6xQCjGW6cJKJC9g2KxmW3yHY365DVDG6Vhr6sT6WDNrcVZybKk5TclX8W7L4NTz8njf_BW68mFWq4fsGHmW1_syWk7_hBt-W2LcY0J5jJG6QW3xvlpk6KKzq13hlg1">&#8220;Combining Prompt Engineering, Programmatic Labeling, and Model Tuning in the No-Code NLP Lab.&#8221;</a></p><p>The lab&#8217;s name change does not change our commitment to you: to keep it free, to keep improving it with frequent releases, and to keep making it the best platform for teams who build and deploy state-of-the-art NLP models. Ready to give the NLP Lab a go? Install it with a few clicks:</p><p>&#183;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://aws.amazon.com/marketplace/pp/prodview-nsww5rdpvou4w">AWS Marketplace</a></p><p>&#183;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://azuremarketplace.microsoft.com/pt-br/marketplace/apps/johnsnowlabsinc1646051154808.annotation_lab?tab=overview">Azure Marketplace</a></p><p>&#183;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <a href="https://www.johnsnowlabs.com/nlp-lab/">One-Liner Kubernetes Script</a></p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in 
Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Celebrating Our Free & Open-Source Contributions in 2022]]></title><description><![CDATA[As we enter 2023, John Snow Labs' focus is on Responsible AI, empowering domain experts, and giving you control over large language models.]]></description><link>https://www.talby.com/p/celebrating-john-snow-labs-free-and</link><guid isPermaLink="false">https://www.talby.com/p/celebrating-john-snow-labs-free-and</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Fri, 16 Dec 2022 06:45:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/fbf9a8a5-a2e9-4743-89e9-43d1aa9dd353_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T4m1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T4m1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!T4m1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!T4m1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!T4m1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T4m1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3083116,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T4m1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!T4m1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!T4m1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!T4m1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a1941-ae3f-493a-8db0-91c678a0356e_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>Thank you for supporting John Snow Labs in 2022. 
With the help of a global AI community, we&#8217;ve been able to grow our free &amp; open-source initiatives and have several milestones to celebrate together this holiday season:</p><ul><li><p><a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWyGDP73dWMTV55B9l41PZZ6W6tdK5Q4TQ0HfN33Mk4X3q8_QV1-WJV7CgY_lW2dTmbK36h7dRW5T48xj8PGLQqW3W86-X1Gl4H-N8St4vKpp7s1W7Z59Tb501hvgW6HDhXZ5mk5-CW36ZN5h6qdRfpW1b64tr4wF6vgW328Pkd4V-q9cW8kynBD6DqZydW11gfPH2LDdBFW69RRml3mRbWFW5Z5TbB3-cCrpW68sPpK16zbShW6ZqZpR8Mv9LJW15kGtX6kTQC6W2Tc9Rj7v4F4SW8WkGjN9158bsVxtQlT83ylQjW3HfbN37lf9RbW6wc2TS5CKH5zW7ds36V91CB993bpM1">Spark NLP</a>&nbsp;crossed 5 years of releasing software every two weeks. It has over 45M downloads of the open-source library across 12 supported platforms.</p></li><li><p><a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWyGDP73dWMTV55B9l41PZZ6W6tdK5Q4TQ0HfN33Mk5c3q905V1-WJV7CgFV5W7XwBrr25WCJKW389dRl4NFR4XW5ck-Fp90prbQW1Vp2q15-1l--VBqNxt4DQBWpW78bmrq3YpPCJMGxbyGCK4_cW5648Q64fblWlW1VL6vc8BXRlqW5y8Nr361wZ1mW8n8cRy1BcLRnN38r5vc1WgdRW3mLvW58fZTFfW7YFH8X5kqnkVW4nG9xs7RnnX9VW8Wc-1mSzWKW7Rf0S89h4388W3f29Sv6SxR8dW8_23pF33S0Z-F6BzC9N5xvDW7tF5g_9gKcclW6D7QKR7wl3PBW8vcXtM1jrXt6W5nZhS76flxmy31BP1">The Annotation Lab</a>&nbsp;is now actively used by more than 100 organizations worldwide to label, train, and test NLP models without coding.</p></li><li><p><a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWyGDP73dWMTV55B9l41PZZ6W6tdK5Q4TQ0HfN33Mk5w3q90pV1-WJV7CgPc8W7M3L0h50YCcDW82g4xx4YJ86cW18rnyP2PL_w5N6lCDgXHn2dlW3sPXk34CmwjpW81sLMk53mG5bN6Zp8zf3bqFrW8Zc2ZT3H9pg6W5LVMLT9dGqg5M2BfG03gv37W1-_1TN6prKXTW62J2wn6sQcxLVFl_Wy5RrW4LW5j2Nsg8pGLVVN1xqY0YV2P6pW8KYPcl3LtSpqW8RFCGz3CRh_LW5x2yBx5_chs7W8p4k0Z9jGNRwW59BjJY7QHphHV_16L77Zndh6N2YVSHlSmFhMW320TLz1SQvpjW7h5Jnm1BbtjsW8ss1Ry54Y_crW2CsFC26jBnd-3n_n1">Spark NLP for Academia</a>&nbsp;provided almost 300 free academic licenses to accelerate research into Healthcare NLP to the universities of Harvard, 
Cambridge, Stanford, Oxford, MIT, Duke, Toronto, Georgetown, Madrid, Chicago, Milan, Athens, and many others.</p></li><li><p><a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWyGDP73dWMTV55B9l41PZZ6W6tdK5Q4TQ0HfN33Mk4X3q8_QV1-WJV7CgRjCD3TD52WL5MN5CFkv0tKhS-W2JSx6R34_Pb7W3ql5_25_GBZxW164kZm8h2hhJW5062Zn3wth7SW7Mbw8r2sSvt-W85rzpn3_wwyGW3plcp61lm_z0W3KG5Pz3l30H-W53VkD45GCLfdN85QYWx9b191W4P7b1y7zGVKlVw1Gd_8bMtRRW8_g9gK2whSy6W3WpFgS9448QKW8Fn1Cr4Cz04vW5PFWQF1F7Zl1W7hgXv98xgb-wW1VYS8N7RPx3nW1Jp5CT8rhx8XW6LxQfm27RbFS3pVG1">The NLP Summit</a>&nbsp;once again drew more than 10,000 people this year, as the largest gathering of applied NLP professionals worldwide. Other educational initiatives include the Healthcare NLP Summit, monthly webinars, talks at over 20 conferences, and over 500 free student passes to Spark NLP training &amp; certification.</p></li><li><p><a href="https://ccpq104.na1.hubspotlinks.com/Ctc/2H+113/ccpq104/VWyGDP73dWMTV55B9l41PZZ6W6tdK5Q4TQ0HfN33Mk5c3q905V1-WJV7CgHDjW5n6G29529sz0W6pqG3S8x05nkN8cmC3GqrLfdW3nwNVQ6Z_T8JW28Khw53CFd-FW1zLDXY4gKf7yW5jJnDf5sdCk8W8swkKs4CYnWtN14X4Zs3lbs9W3QzCmp5SjxDlW6rCVVs1z6JdMW5rCCTg2pqD81W3KdgBt5Xj5v6W2rcXX289L9CbW8nvPlV85nbGsN3ksLrjlY5Y2W57Pw1C2B2klqVzf8Vf5hqb3hW5F748m2vFDwSN7H461BzBGCjW2L4ljY8qyhnmW6-NcVB5XGxKgW947mlv6mQKNmN5sVTS_sY9pT37301">Peer-Reviewed Papers</a>. The 9 papers we co-published this year established new state-of-the-art accuracy in clinical relation extraction, adverse event detection, and biomedical named entity recognition. The team also contributed to medical research papers on COVID-19 outcome prediction and risk factor detection.</p></li></ul><p>In our commercial work, we&#8217;ve significantly expanded the Healthcare NLP library which now has 800+ state-of-the-art pre-trained models including industry-first models specific to oncology, radiology, public health, and social determinants of health. 
We&#8217;ve also launched Finance NLP (100+ models &amp; pipelines), Legal NLP (500+), and Visual NLP (expanding Spark OCR with Document AI capabilities). Ease of use is greatly improved with one-click deployment on AWS, Azure, and Databricks, and a new Python library with simplified installation and access to thousands of models with one line of code.</p><p></p><p>We&#8217;ve grown our revenue and customer base &#8211; which enables us to keep you at the state of the art of NLP &amp; AI for years to come, whether you use our paid products or free software. This primarily means providing you with production-grade versions of the most accurate models available, tuned for your needs. But going into 2023, it also means other things. We need to make Responsible AI a reality and raise the bar of what a production-ready model means. We need to enable doctors &amp; lawyers, instead of data scientists, to train AI models. And we need to make the new class of large language models work for you under your control. 
As always, we&#8217;d appreciate your help by telling us what to build for you, trying out the new software, or contributing to the open-source codebase.</p><div class="pullquote"><p>&nbsp;With Thanks, Happy Holidays, and a Happy New Year!</p></div>]]></content:encoded></item><item><title><![CDATA[Natural Language Processing with Free Low-Code & No-Code Tools]]></title><description><![CDATA[Global Artificial Intelligence Conference, March 2022.]]></description><link>https://www.talby.com/p/natural-language-processing-with</link><guid isPermaLink="false">https://www.talby.com/p/natural-language-processing-with</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Sun, 06 Mar 2022 16:39:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tcYj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tcYj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tcYj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tcYj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!tcYj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tcYj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tcYj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:137805,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tcYj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!tcYj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!tcYj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!tcYj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa3207f8b-18b0-4e91-ae6d-72c95de5e808_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>Today, applying state-of-the-art natural language processing (NLP) to extract information from documents and images 
requires data scientists who specialize in current deep learning and transfer learning techniques. However, new tools are automating a lot of the model selection, hyperparameter tuning, and pipeline optimization work to enable business domain experts &#8211; lawyers, doctors, financial analysts &#8211; to train &amp; tune such models on their own. This session introduces the Annotation Lab: a free, enterprise-grade, privacy-focused tool that enables teams of domain experts to build highly accurate models that extract information from free-text and visual documents.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Real-World Limits to the Accuracy of Medical Data]]></title><description><![CDATA[Recent research found errors in widely used datasets, especially in healthcare. NLP technology helps, but limitations exist. 
Trust in AI requires reproducibility and recognition of its limitations.]]></description><link>https://www.talby.com/p/real-world-limits-to-the-accuracy</link><guid isPermaLink="false">https://www.talby.com/p/real-world-limits-to-the-accuracy</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 16 Feb 2022 06:25:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Bpxk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bpxk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bpxk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Bpxk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Bpxk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Bpxk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Bpxk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:91050,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bpxk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Bpxk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Bpxk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Bpxk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28310e4d-2d0a-468e-b91f-72652b8f1107_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>Algorithms are only as good as the quality of data they&#8217;re being fed. This is not a new concept, but as we begin to rely more heavily on data-driven technologies, such as artificial intelligence (AI) and other automation tools and applications, it&#8217;s becoming a more important one.&nbsp;</p><p>Recent&nbsp;<a href="https://arxiv.org/pdf/2103.14749.pdf">research</a>&nbsp;from MIT found a high number of errors in publicly available datasets that are widely used for training models. 
The test sets of 10 of the most widely used computer vision, natural language processing (NLP), and audio datasets contained label errors at an average rate of 3.3%.&nbsp;</p><p>Given that accuracy baselines on these datasets are often at or above 90%, many reported improvements are smaller than the error rate of the test set itself, so a lot of research innovation may amount to chance or to overfitting to errors. Data science practitioners should exercise caution when choosing which models to deploy based on small accuracy gains on such datasets.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.talby.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p>These findings are particularly concerning for AI applications in high-stakes industries like healthcare. Work in this field can prevent disease, accelerate the development of life-saving medicine, and help us understand the spread of disease and other critical health trends. Accuracy is vital in healthcare, but the field is rife with complexities that make accuracy extremely hard to achieve.&nbsp;</p><p>One of the reasons for this is the data source. More than half of the clinically relevant data for applications like recommending a course of treatment, finding actionable genomic biomarkers, or matching patients to clinical trials is found only in free text. This includes physicians&#8217; notes, diagnostic imaging, pathology reports, lab reports, and other sources not available as structured data within electronic health records (EHRs). These information sources include nuances and data quality issues that make it hard to connect the dots and get a full picture of a patient.</p><p></p><p>Another barrier exists in the limitations of what's in the data itself. 
Because there are no shared standards for data collection across hospitals and healthcare systems, inconsistencies and inaccuracies are common. Between different organizations collecting different information and records not being updated on a consistent basis, it&#8217;s difficult to know how accurate the data is &#8212; especially if it&#8217;s being moved and updated among different providers.&nbsp;</p><p>Providers aren&#8217;t the only source of error, either: inaccuracies also come directly from the patients themselves. A&nbsp;<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4441665/">recent study</a>&nbsp;in the Journal of General Internal Medicine shows just how prevalent this can be. When exploring the accuracy of race, ethnicity, and language preference in EHRs, the study found that 30% of whites self-reported identification with at least one other racial or ethnic group, as did 37% of Hispanics and 41% of African Americans. Patients were also less likely to complete the survey in Spanish than the language preference noted in the EHR would have suggested.&nbsp;</p><p>There&#8217;s clearly a need for better data collection practices in healthcare and beyond. Accurate information can help the medical community understand more about social determinants of health, patient risk prediction, clinical trial matching, and more. Standardizing how this data is collected and recorded can help ensure that clean data gets shared and analyzed correctly. This is both a medical and a social challenge. For example, what is the &#8220;correct&#8221; race to fill in? When exactly is someone considered a smoker? This is also partly a technology challenge, as we&#8217;re already way beyond the limit of what&#8217;s reasonable to ask providers and patients to manually input.&nbsp;</p><p>There are also data quality issues outside our direct control, such as fraud and abuse. 
The National Health Care Anti-Fraud Association&nbsp;<a href="https://www.bcbsm.com/health-care-fraud/fraud-statistics.html">estimates</a>&nbsp;that "healthcare fraud costs the nation about $68 billion annually &#8212; about 3% of the nation's $2.26 trillion in healthcare spending. Other estimates range as high as 10% of annual healthcare expenditure, or $230 billion." While we can account for error rates within the data, it&#8217;s an imperfect science at the end of the day, and it&#8217;s important to understand its limitations.&nbsp;</p><p>That said, it&#8217;s not all doom and gloom when it comes to data quality or the algorithms we use. Technology that can automatically understand the nuances of unstructured text and images, as well as reconcile conflicting and missing data points, is gradually maturing. NLP, for example, can address many pitfalls of data quality, such as uncovering discrepancies between an EHR and a doctor&#8217;s transcript or what a patient self-reports. Newer algorithms and models can apply the context, medium, and intent of each data source to infer useful semantic answers.</p><p>This is especially useful when you consider how specific clinical language is. Take how we indicate triple-negative breast cancer (TNBC), for instance. While the acronym TNBC isn&#8217;t hard to identify, the condition can also be denoted as &#8220;Er-/pr-/h2-&#8221;, &#8220;(er pr her2) negative&#8221;, &#8220;tested negative for the following: er, pr, h2&#8221;, and &#8220;triple-negative neoplasm of the upper left breast&#8221;, to name a few. NLP can identify variations of these terms when they are in context &#8212; and healthcare-specific deep learning models have gotten very good at this.&nbsp;</p><p>State-of-the-art, peer-reviewed, publicly reproducible accuracy results, on both competitive academic benchmarks and real-world production deployments, have been steadily improving over the last five years. 
Libraries like Spark NLP surpass 90% accuracy on a variety of clinical and biomedical text understanding tasks. Reproducibility of results, consistency in applying clinical guidelines at scale, and the ability to easily tune models to a specific clinical use case or setting are three keys to successful implementations and to building broader trust in AI technology.</p><p>Healthcare is a complex field, and so, too, is its data. The technology that helps us use data to make decisions in this field will keep improving. But it&#8217;s critical to remember the fundamental limits on the quality and accuracy of the data that powers these algorithms. Simply put, it&#8217;s unsafe to assume that a piece of data is correct just because someone typed it into a computer.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Celebrating Our Healthcare AI & Open-Source Contributions in 2021]]></title><description><![CDATA[John Snow Labs highlights: customer growth, increased downloads, and regular software releases. Annotation Lab and NLP Server are now free. NLP Summit is a success. 
We'll be carbon-neutral starting Jan 1st.]]></description><link>https://www.talby.com/p/celebrating-john-snow-labs-healthcare</link><guid isPermaLink="false">https://www.talby.com/p/celebrating-john-snow-labs-healthcare</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Thu, 16 Dec 2021 06:49:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!toGC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!toGC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!toGC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!toGC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!toGC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!toGC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!toGC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:140564,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!toGC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!toGC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!toGC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!toGC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F630b82ea-1d3f-43e8-b4f8-f13070d8f1a3_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>With the support of the global AI community &#8211; that is, thanks to your help &#8211; we&#8217;ve been able to continue to develop the most accurate, scalable and robust NLP solution for the healthcare industry and global AI community. 
This year:</p><ul><li><p><strong><a href="https://ccpq104.na1.hubspotlinks.com/Btc/2H+113/ccpq104/VWcvTS1hjl22W6xxGkK4CY0DsW861hN74C7YP1N6pj1VV5nKv5V3Zsc37Cg-ycW5-7Kh41krQJtW7V5jkh2St5SZW4ypB542NWGpRN2NLZhh4LJPNN6Bcwjf-CCSzW5DwG1k2J6L48W54Gbx6820FxbW2Y1RMz25y44JN1FJ38rD7RsjN5mvPLNDS5dSV9cLCc4rwY3dW7SzcfZ2gdWncW5sgyg13K7kfsW318SXt8znvrSVMbFv36j8DDcW7fQvHL7VSPc3W55lg6r4YCCS1W8Qt1lQ3NpmtNV1Wq8g7bpZFkW33JZc93j14WXW1Y3FpP2WMt-sW34Wcs0839_d-N3CgS3NKPFghW8PD8LD3GvcgGW8Y2DH_3zRw57N7BXGSw3PbDSW4Y2Bv51flC_zW5_P7sk3BJHVcW5k2bM01Ghm9vW1x95fq718M-DW9lq7l73WV828N8L2wS2Knbdl3nYg1">John Snow Labs' customer base grew by 5x</a></strong> &#8212; the revenue from which we&#8217;re reinvesting to fuel faster innovation.</p></li><li><p><strong><a href="https://ccpq104.na1.hubspotlinks.com/Btc/2H+113/ccpq104/VWcvTS1hjl22W6xxGkK4CY0DsW861hN74C7YP1N6pj1X33q905V1-WJV7CgWrRW78rjYj3TQWMxW6ZD5JR6pQQNPVlPZJm93JPxdW2hmvFB1F4VRTW1HsPK98z5Z8lN5Zwj3nbWpv7W5B4Nlb3D9qQ3W5mjd-K4RZt9fW5GlXdT2zYM6jF8B5W4Bsrw-W3gxh80401g6SW6l_xf1189nbNW83th5Y3d6fkyW1tvhkg5J5Y7XW7x5r9q4-LRP7W64Nyzw7xWl8PN8w-MQhWxSS8W7LHW8J4XM-LLW1LYhBP2DLGQlW5bcnYt6Jv0K9N2hfTS1ttZ4kW3LTPjd4sBLm4W8bzD3l2lkXJ0W17tkPH3ylPVP3lP61">Spark NLP</a></strong> downloads have grown 4x. We moved from celebrating one million downloads <em>ever</em> in 2020 to celebrating one million downloads <em>per month</em> in 2021.</p></li><li><p>We&#8217;ve kept our promise to keep you at the state of the art in the fast changing NLP world by releasing new software every two weeks - now for 4 years in a row. 
Check out how much was added and improved in <strong><a href="https://ccpq104.na1.hubspotlinks.com/Btc/2H+113/ccpq104/VWcvTS1hjl22W6xxGkK4CY0DsW861hN74C7YP1N6pj1XG3q90JV1-WJV7CgRzLW1cbS3429cDKgVkMF6Y582Q0BN1FVjX2HWMJHW5kyQP04r1BrnW7q1g2l7mZSGQW2LBSjb1hLF3GW12s5W-50Vx9-VXLVzM6M9SBzW7FM4-18sS56VW9kJxTD97LPPjW3d7dzC4_JjgzW5xSjWK13qcwhV76f722N9MdWW5pPMkv5YRp2QVdYsT_5LcVgQW6z05Fn7lfydZW8-Q1_Y4cLdgJW6HNc6G4Fv8lzW496M3Q1MpVfnW84gYxQ26VnxXW69r-j-7NRMx8W9c5RvM7FnPj-W1QgT8S8LK1nZW82SgQ34KZykZW70rDKq1JjDlSVJ1Wlm4-YmdkW84-Rw92qsJHkN7rWmgQnjq9_35-81">Healthcare NLP</a></strong>.</p></li><li><p>The <strong><a href="https://ccpq104.na1.hubspotlinks.com/Btc/2H+113/ccpq104/VWcvTS1hjl22W6xxGkK4CY0DsW861hN74C7YP1N6pj1Xm3q90pV1-WJV7CgGhHW4scfwW5lNk5cN4LQcVW89jDRW3113Tw8tn-5qW7qTJV_7My3YLW28gnBV6lRGptN2JVyxGQSs_qVQpsMX6tG2sBVf7GQZ7qKZx7W6Qshfn7pJwkyW1D5ccB1c8DkBW4WZwFZ4M2dzXN8RStQPtvbqtW8DW_0w1q0kjkW6lzsZX8RRDnqW53Q2Fy4C3fRyW124qZq8JNRpTW7G_gQR5RZCnZW7wf3hM96B6gzW4vv_JX90Hy5RW2r2Fqv5zFFTMW7cdBN063Y_CyW3mmR-v8JW0y3W473m4f3mnPn4Vk8CFQ14ZBNzW5DRqLJ2vkQYfW5L39601W_l5w390K1">Annotation Lab</a></strong> and <strong><a href="https://ccpq104.na1.hubspotlinks.com/Btc/2H+113/ccpq104/VWcvTS1hjl22W6xxGkK4CY0DsW861hN74C7YP1N6pj1Xm3q90pV1-WJV7CgTzvW1r3KCs2XnNmRW5jWh6p5FYS2gW91cZfz5yz8s4W8-jzQn9kByLmW4mqRnx6Lzd-pN57MyCHW9MhJW1Lnmbh4Gjn-_W4yTbBH5NcJFGW15hvKJ17pLzGW5FbMcN6wyhn4W96CZ-m7ShrJmW6XshCc8NtX2BVd8BnH8nHclbW8FSclP79h6TWW6znFhR66XBDGN9fQXnHHlGN5W7S73Mk8P9Vy1W2NW2l74nnKJnW5brDrf4RBndpW6_R-926fxw-bN2zq2_VpHc-dW7JPD529ljTLHN1rQXL-sWr2_W7ZZ9cm3RTV_WW3Pjhf-6-6b5HW41d7kg35pBk93hrc1">NLP Server</a></strong> have been made free forever, to help accelerate the global NLP community - including all enterprise-grade, high-compliance, and no-code features.</p></li><li><p>The <strong><a 
href="https://ccpq104.na1.hubspotlinks.com/Btc/2H+113/ccpq104/VWcvTS1hjl22W6xxGkK4CY0DsW861hN74C7YP1N6pj1WN3q8_QV1-WJV7CgD3VW7rR4vW7tW51xW85pjqj7SlB_5W215MPx2d2ZxfW14-40f4LhhzdW53z7nd9dpDQkW3K5_jJ8kqR4zW5dsFH927cFpKW5fxJd18SNrXJW12hrJr80MGrvW7McK1Y2M8BnRW7FbY9x2NT_cPW42_dbS92LVqGN1Kgm87xWsjxW42gBkL5qkZv-W2mRYzq2MVbdCW37cwRx6_QrmhW8gcqGS4JSsKtN3RCS4cZ9J2yW58Zg1y2FWGj0W5ggGp_31Qg8WW5DzWQw30BYRSW51qw6z2nGBNK3bMc1">NLP Summit</a></strong> grew to be one of the larger industry AI conferences with over 10,000 attendees enjoying 50+ technical sessions &amp; keynotes. The Healthcare NLP Summit 2022 is up next on April 5th &amp; 6th.</p></li></ul><p>None of this would be possible without your support, feedback, and referrals. 
We're humbled by your continuing trust in us and are impressed and inspired by the way you put our software to good use.</p><p>As a small token of appreciation, starting January 1<sup>st</sup> John Snow Labs will become a certified 100% carbon neutral company &#8212; just doing our bit to take responsibility, and prove that AI can be built and scaled sustainably.</p><p>Thanks again, for everything.</p><p>Wishing you and your loved ones a safe and joyful holiday season.</p>]]></content:encoded></item><item><title><![CDATA[Operating AI Models Safely in Production]]></title><description><![CDATA[Deploying and maintaining natural language processing (NLP) models in production comes with its challenges, especially in ensuring model accuracy over time in real-world environments.]]></description><link>https://www.talby.com/p/operating-ai-models-safely-in-production</link><guid isPermaLink="false">https://www.talby.com/p/operating-ai-models-safely-in-production</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 17 Nov 2021 06:53:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1m2W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!1m2W!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1m2W!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1m2W!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1m2W!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1m2W!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1m2W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg" width="1456" height="1048" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:117042,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1m2W!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1m2W!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1m2W!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1m2W!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea39ef0-a65c-468a-9c82-4ea63c3b40fe_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>Getting natural language processing (NLP) models into production is a lot like buying a car. In both cases, you set your parameters for your desired outcome, test several approaches, likely retest them, and the minute you drive off the lot, the value starts to plummet. Like owning a car, having NLP or AI-enabled products has many benefits, but the maintenance never stops, or at least it shouldn&#8217;t if the product is to function properly over time.</p><p>Productionizing AI is hard enough, but keeping models accurate down the line in a real-world environment can present even bigger governance challenges. A model&#8217;s accuracy starts to degrade the moment it hits the market, because real-world data behaves differently from the predictable research environment the model was trained in. 
The highway, after all, is a very different setting from the dealership lot.</p><p>This is called concept drift: when the underlying variables change, the concept the model learned may no longer hold. It&#8217;s nothing new in the field of AI and machine learning (ML), but it continues to challenge users. It&#8217;s also a contributing factor to why, despite huge investments in AI and NLP in recent years, only around 13% of data science projects actually make it into production (<em><strong><a href="https://venturebeat.com/2019/07/19/why-do-87-of-data-science-projects-never-make-it-into-production/">VentureBeat</a></strong></em>).</p><p>So what does it take to move products safely from research to production? And, arguably just as important, what does it take to keep them accurate in production as conditions change? There are a few considerations that enterprises should keep in mind to make sure their AI investments actually see the light of day.</p><h2><strong>Getting AI models into production</strong></h2><p>Model governance is a key component in productionizing NLP initiatives, and its absence is a common reason so many would-be products remain projects. Model governance covers how a company tracks the activity, access, and behavior of models in a given production environment. It&#8217;s important to monitor these to mitigate risk, troubleshoot, and maintain compliance. 
This concept is well understood among the global AI community, but it&#8217;s also a thorn in their side.&nbsp;</p><p>Data from the <strong><a href="https://gradientflow.com/2021nlpsurvey/">2021 NLP Industry Survey</a></strong> showed that high-accuracy tools that are easy to tune and customize were a top priority among respondents. Tech leaders echoed this, noting that accuracy, followed by production readiness and scalability, mattered most when evaluating NLP solutions. Constant tuning is key to models performing accurately over time, but it&#8217;s also the biggest challenge practitioners face.</p><p>NLP projects involve pipelines, in which the results from a previous task and pre-trained model are used downstream. Often, models need to be tuned and customized for their specific domains and applications. For example, a healthcare model trained on academic papers or medical journals will not perform as well when used by a media company to identify fake news.</p><p>Better searchability and collaboration within the AI community will play a key role in standardizing model governance practices. This means storing modeling assets (notebooks, datasets, resulting measurements, hyperparameters, and other metadata) in a searchable catalog. Enabling reproducibility and sharing of experiments across data science team members is another area that will benefit those trying to get their projects to production grade.</p><p>More tactically, rigorous testing and retesting is the best way to ensure models behave the same in production as they do in research, two very different environments. 
Versioning models that have advanced beyond an experiment to a release candidate, testing those candidates for accuracy, bias, and stability, and validating models before launching in new geographies or populations are practices that all practitioners should follow.</p><p>With any software launch, security and compliance should be baked into the strategy from the start, and AI projects are no different. Role-based access control, an approval workflow for model release, and the storage of all metadata needed for a full audit trail are some of the security measures necessary for a model to be considered production-ready.</p><p>These practices can significantly improve the chances of AI projects moving from ideation to production. More importantly, they help set the foundation for practices that should be applied once a product is customer-ready.</p><p></p><h2><strong>Keeping AI models in production</strong></h2><p>Back to the car analogy: There&#8217;s no definitive &#8220;check engine&#8221; light for AI in production, so data teams need to be constantly monitoring their models. Unlike with traditional software projects, it&#8217;s important to keep data scientists and engineers on the project, even after the model is deployed.</p><p>From an operational standpoint, this requires more resources, both in people and in cost, which may be why so many organizations fail to do this. The pressure to keep up with the pace of business and move on to the &#8216;next thing&#8217; also factors in, but perhaps the biggest oversight is that even IT leaders don&#8217;t expect model degradation to be a problem.</p><p>In healthcare, for example, a model can analyze electronic medical records (EMRs) to predict a patient&#8217;s likelihood of having an emergency C-section based upon risk factors such as obesity, smoking or drug use, and other determinants of health. 
If the patient is deemed high-risk, their practitioner may ask them to come in earlier or more frequently to reduce pregnancy complications.</p><p>The expectation is that these risk factors remain constant over time, and while many of them do, the patient is less predictable. Did they quit smoking? Were they diagnosed with gestational diabetes? There are also nuances in the way the clinician asks a question and records the answer in the hospital record that could result in different outcomes.</p><p>This can become even trickier when you consider the NLP tools most practitioners are using. A majority (83%) of respondents to the aforementioned survey stated that they used at least one of the following NLP cloud services: AWS Comprehend, Azure Text Analytics, Google Cloud Natural Language AI, or IBM Watson NLU. While the popularity and accessibility of cloud services are obvious, tech leaders cited difficulty in tuning models and cost as major challenges. Essentially, even experts are grappling with maintaining the accuracy of models in production.</p><p>Another problem is that it simply takes time to see when something&#8217;s amiss. How long that takes can vary significantly. If Amazon updates a fraud detection algorithm and mistakenly blocks customers in the process, customer service emails will point to an issue within hours, maybe even minutes. In healthcare, it can take months to get enough data on a certain condition to see that a model has degraded.</p><p>In short, to keep models accurate you need to apply the same rigor in testing, automated retraining pipelines, and measurement that you applied before the model was deployed. 
When dealing with AI and ML models in production, it&#8217;s more prudent to expect problems than to expect optimal performance several months out.</p><p>When you consider all the work it takes to get models into production and keep them there safely, it&#8217;s understandable why 87% of data science projects never make it to market. Despite this, 93% of tech leaders indicated that their NLP budgets grew by 10-30% compared to last year (<strong><a href="https://gradientflow.com/2021nlpsurvey/">Gradient Flow</a></strong>). It&#8217;s encouraging to see growing investments in NLP technology, but it&#8217;s all for naught if businesses don&#8217;t invest in the expertise, time, and continual updating required to deploy successful NLP projects.</p>]]></content:encoded></item><item><title><![CDATA[How to strike the balance between privacy and personalization in healthcare and beyond]]></title><description><![CDATA[Striking the balance between privacy and personalization is not easy, but applications in healthcare have proven that it&#8217;s possible and can even improve outcomes.]]></description><link>https://www.talby.com/p/how-to-strike-the-balance-between</link><guid isPermaLink="false">https://www.talby.com/p/how-to-strike-the-balance-between</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Fri, 16 Jul 2021 15:11:00 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!F-ja!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The trade-off between widespread technology adoption and responsible use often lies on the spectrum of privacy. When it comes to technologies fueled by data, such as artificial intelligence (AI), it&#8217;s even harder to strike the balance between equitable access and inherent risk. This is felt heavily in the healthcare industry, as regulations around information sharing are generally more stringent than those for other verticals.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F-ja!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F-ja!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!F-ja!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!F-ja!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!F-ja!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F-ja!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg" width="1456" height="1048" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:94127,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F-ja!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!F-ja!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!F-ja!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!F-ja!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd6f8f82a-df06-4cc6-b8e1-088d63a067cf_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><p>Because of laws like HIPAA, healthcare has had a head start in changing its approach to handling personally identifiable information (<a href="https://www.dol.gov/general/ppii">PII</a>) and other sensitive information, while still leveraging technology and working with third parties to streamline processes. 
And healthcare organizations have figured out how to do this without sharing their valuable data. This contradicts the long-held belief that SaaS companies require customer data to improve services and get accurate, unbiased insights. That&#8217;s simply not the case.</p><p>It may sound implausible that less data sharing could lead to more specialization, but technologies like natural language processing (<a href="https://en.wikipedia.org/wiki/Natural_language_processing">NLP</a>) make it achievable. Advances in areas like transfer learning make it possible to build models and then optimize them locally. This approach not only safeguards patient information but also makes it possible to deliver highly personalized care.</p><p>Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task. Domain experts, like doctors and other professionals, can tune pretrained models on their organization&#8217;s data to localize them so that they reflect their own patient populations, and they can do it without a data scientist and without compromising their most valuable asset: their data.</p><p>When companies share customer information, whether with a partner or a vendor, that information identifies individuals personally. In the case of healthcare, this includes medical information. The security of that information is in the hands of the trusted third party, whether or not it has strong security measures in place. 
The risk of a data breach, <a href="https://www.helpnetsecurity.com/2020/07/23/human-error-cybersecurity/">human error</a>, or information leaking is very real, especially if your data is all stored in one place.</p><p>On the flip side, access to this data becomes another challenge. Figuring out who within an organization or department should have access to the data, and for what purpose, often takes longer than the timespan in which the data is relevant. The point is that this information is risky to store and difficult to access. Most importantly, it&#8217;s just not necessary for yielding the health and business insights that matter.</p><p>Beyond privacy, localizing AI and NLP models not only mitigates the risks of storing and accessing data but also empowers providers to deliver better care. Here&#8217;s why: most clinical evidence is based on research and clinical trials done on white males. This is true even for pregnancy-induced conditions, like gestational diabetes, and their treatments. One-size-fits-all criteria are not applicable in medicine, so it&#8217;s understandable why this data would be problematic.</p><p>Starting with a general AI model enables caregivers to work within the parameters of a certain condition, such as kidney disease, and then fine-tune the model to a given patient population. This level of personalization provides many benefits, and one of them is better serving marginalized groups. For example, Black patients, who are often under-insured, are underrepresented in general models.</p><p>Transfer learning allows for a level of specialization to treat this population more accurately. The same approach can be applied to all patients, conditions, and symptoms, and it can be done down to the hospital or clinic level.</p><p>Factors like age, disease distribution, and social determinants of health play a big role in equitable access to healthcare, and AI is the only realistic way to distill the data in a way that enables us to make sense of it. 
But how we responsibly handle this data has to be a big part of the conversation, and making sure it stays in the right hands is a good first step. Better yet, transfer learning makes it possible to glean the same, and in some cases better, insights without the risk of mishandling or compromising sensitive data.</p><p>Strict regulations in healthcare are nothing new, but as high-profile data breaches and ransomware attacks multiply, all industries are going to have to rethink their data privacy hygiene. Additionally, as more personalized care and customer experiences become paramount to serving different populations, AI and NLP are technologies that modern businesses must get acquainted with.</p><p>Striking a balance between privacy and personalization is not easy, but applications in healthcare have proven that it&#8217;s possible and can even improve outcomes.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Model Governance: A Checklist for Getting AI Safely to Production]]></title><description><![CDATA[The pressure to get AI models into production is real and growing as the AI arms race continues. 
Enterprises must keep responsible AI practices in place to mitigate harmful biases.]]></description><link>https://www.talby.com/p/model-governance-a-checklist-for</link><guid isPermaLink="false">https://www.talby.com/p/model-governance-a-checklist-for</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 19 May 2021 15:15:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Putting AI models into production is notoriously difficult, and research reflects that. A <strong><a href="https://www.newvantage.com/wp-content/uploads/2020/01/NewVantage-Partners-Big-Data-and-AI-Executive-Survey-2020-1.pdf">NewVantage Partners survey</a></strong> found that the percentage of firms investing more than $50 million in Big Data and AI initiatives is up to 64.8%, with a total of 98.8% of firms investing. Despite this, only 14.6% report that they have deployed AI capabilities into widespread production. 
So what&#8217;s actually holding enterprises back from realizing the full capabilities of their AI and <strong><a href="https://opendatascience.com/free-download-the-odsc-guide-to-machine-learning/">machine learning</a></strong> investments?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sC5-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sC5-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sC5-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sC5-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sC5-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sC5-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg" width="1456" height="1048" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1048,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:122121,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sC5-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 424w, https://substackcdn.com/image/fetch/$s_!sC5-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 848w, https://substackcdn.com/image/fetch/$s_!sC5-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!sC5-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd162ce8-5125-440b-b9bd-f81f8a0c1a83_1456x1048.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p></p><p>For starters, this is relatively new territory. As an industry, we&#8217;ve had approximately 40 years of experience creating best practices and tools for storing, versioning, collaborating, securing, testing, and building software source code. By contrast, we&#8217;ve only had about four years doing so for AI models. This is a significant gap, and one we need to bridge quickly for AI innovation to keep accelerating as it has over the last several years.</p><p>This isn&#8217;t anything we haven&#8217;t done before. The software industry has gone through multiple generations of tools and methodologies for software code governance, from configuration control, collaboration, test processes, and build processes to code repositories and metadata management. Now, we&#8217;re just starting to explore these issues for machine learning models, and as with any new technology, we&#8217;re already experiencing the growing pains. 
Lack of AI governance is already hurting the productivity of data science teams and preventing the safe deployment and operation of models in production.</p><p>One of the main challenges in productionizing AI and ML models is accommodating the unique and sometimes complex environments in which they operate. Organizations also need to be cognizant of legal and compliance challenges that come along with implementing AI technologies. Just this week, the European Commission proposed measures that would ban certain high-risk AI applications in the EU, while others will face stricter constraints that threaten hefty fines for companies that don&#8217;t comply (<em><strong><a href="https://www.bloomberg.com/news/articles/2021-04-21/facial-recognition-other-risky-ai-set-for-constraints-in-eu?utm_source=google&amp;utm_medium=bd&amp;cmpId=google">Bloomberg</a></strong></em>). As AI continues to proliferate, we&#8217;ll see more legislation regarding governance, and this will need to be part of the equation in productionizing AI models.</p><p>From direct experience in the healthcare and life sciences space, I can say that adhering to strict industry standards and requirements becomes even more important when human lives and health are at stake. Fortunately, there are several widely accepted best practices for model governance, along with freely available tools for applying them. 
These tools can help teams go beyond experimentation and deploy models successfully, and I&#8217;ll cover them in the upcoming session, &#8220;<strong><a href="https://odsc.com/speakers/model-governance-a-checklist-for-getting-ai-safely-to-production/?__hstc=39712252.3cb603648cf8e6e4ad711e1fecfb468a.1686064321426.1686064321426.1686064321426.1&amp;__hssc=39712252.1.1686064321426&amp;__hsfp=2282684124">Model Governance: A Checklist for Getting AI Safely to Production</a></strong>,&#8221; at <strong><a href="https://odsc.com/europe/?__hstc=39712252.3cb603648cf8e6e4ad711e1fecfb468a.1686064321426.1686064321426.1686064321426.1&amp;__hssc=39712252.1.1686064321426&amp;__hsfp=2282684124">ODSC Europe </a></strong>in June.&nbsp;</p><p>One important component I&#8217;ll discuss is the need for better searchability and collaboration among the AI community. For example, modeling assets, including notebooks, datasets, resulting measurements, hyperparameters, and other metadata, can be stored in a searchable catalog. Enabling reproducibility and sharing of experiments across data science team members is another area that will benefit those trying to get their projects to production grade.</p><p>Another touchstone of AI governance is rigorous testing and retesting to ensure models behave the same in production as they do in research. 
Versioning models that have advanced beyond an experiment to a release candidate, testing those candidates for accuracy, bias, and stability, and validating models before launching in new geographies or populations are several best practices that all organizations productionizing AI should be thinking about.</p><p>Security and compliance should be baked into a successful AI strategy from the beginning, and this is another important area I&#8217;ll discuss in my upcoming presentation.&nbsp;</p><p>Role-based access control, an approval workflow for model release, and storage of all the metadata needed for a full audit trail are just a few of the security measures that should be put into place before a model is considered production-ready. This is especially important in highly regulated industries, such as healthcare and finance.</p><p>The pressure to get AI models into production is real (financially, competitively, operationally), and as the AI arms race continues, it&#8217;s only increasing. Despite this, enterprises must keep responsible AI practices in place to mitigate harmful and potentially dangerous biases and inaccuracies. 
To learn more about best practices and valuable tools to help safely govern AI and get models into production safely, <strong><a href="https://odsc.com/europe/?__hstc=39712252.3cb603648cf8e6e4ad711e1fecfb468a.1686064321426.1686064321426.1686064321426.1&amp;__hssc=39712252.1.1686064321426&amp;__hsfp=2282684124">register for ODSC Europe</a></strong>&nbsp;and be sure to <strong><a href="https://odsc.com/speakers/model-governance-a-checklist-for-getting-ai-safely-to-production/?__hstc=39712252.3cb603648cf8e6e4ad711e1fecfb468a.1686064321426.1686064321426.1686064321426.1&amp;__hssc=39712252.1.1686064321426&amp;__hsfp=2282684124">check out my session</a></strong>!</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kkgD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kkgD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 424w, https://substackcdn.com/image/fetch/$s_!kkgD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 848w, https://substackcdn.com/image/fetch/$s_!kkgD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 1272w, https://substackcdn.com/image/fetch/$s_!kkgD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kkgD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png" width="1024" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!kkgD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 424w, https://substackcdn.com/image/fetch/$s_!kkgD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 848w, https://substackcdn.com/image/fetch/$s_!kkgD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 1272w, https://substackcdn.com/image/fetch/$s_!kkgD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F13d1c2eb-e690-4b1d-8910-5376399d4a7c_1024x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" 
type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>David Talby, PhD, is the founder and CTO of <strong><a href="https://www.johnsnowlabs.com/">John Snow Labs</a></strong>, the AI and NLP for healthcare company and developer of the Spark NLP library. He has dedicated his career to helping companies build real-world AI systems, turning recent scientific advances into products and services. 
He specializes in applying machine learning, deep learning, and natural language processing in healthcare.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading AI in Healthcare! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Summary of the 2021 Healthcare AI Industry Survey]]></title><description><![CDATA[The 2021 Healthcare AI Survey reveals the industry's focus on NLP, data integration, and BI. Accuracy, privacy, and model tunability are key for AI system evaluation.]]></description><link>https://www.talby.com/p/summary-of-the-2021-healthcare-ai</link><guid isPermaLink="false">https://www.talby.com/p/summary-of-the-2021-healthcare-ai</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 28 Apr 2021 05:18:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!XPid!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Each of us only gets to experience a tiny part of the world &#8211; <a href="https://en.wikipedia.org/wiki/Blind_men_and_an_elephant">touching a different part of the elephant</a>. 
Therefore, it&#8217;s essential to come together occasionally and assess industry-wide trends.</p><p>The new&nbsp;<a href="https://www.itproportal.com/features/the-state-of-ai-in-healthcare-five-key-findings-enterprises-should-know/">2021 Healthcare AI Survey</a>&nbsp;from Gradient Flow, sponsored by my company, aims to do just that: unearth these areas to provide a better overview of where we actually stand when it comes to AI in healthcare.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XPid!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XPid!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 424w, https://substackcdn.com/image/fetch/$s_!XPid!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 848w, https://substackcdn.com/image/fetch/$s_!XPid!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 1272w, https://substackcdn.com/image/fetch/$s_!XPid!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XPid!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif" width="626" 
height="357" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d65b04fe-3d10-4f96-aced-51ed0f6894f5.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:357,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7560,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!XPid!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 424w, https://substackcdn.com/image/fetch/$s_!XPid!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 848w, https://substackcdn.com/image/fetch/$s_!XPid!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 1272w, https://substackcdn.com/image/fetch/$s_!XPid!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd65b04fe-3d10-4f96-aced-51ed0f6894f5.avif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2><strong>Key Technologies: Data Integration, BI, and NLP</strong></h2><p>One of the most telling findings here is the shift in AI technologies that organizations are currently using or plan to implement in 2021. Respondents to the survey said they wanted to have natural language processing (NLP) (36%), data integration (45%), and business intelligence (BI) (33%) as the three most widely applied technologies in their businesses by the close of 2021.</p><p>These aren&#8217;t just statements of intent &#8212; they&#8217;re backed by money. 
The&nbsp;<a href="https://www.healthcaretechoutlook.com/news/the-2020-nlp-industry-survey-finds-increasing-enterprise-investment-in-natural-language-processing-despite-pandemicimpacted-it-budgets-pnid-66.html">2020 NLP Industry Survey</a>, published by the same group in the fall of 2020, reported that more than half of technology leaders, the people overseeing AI investment, increased the budget allocated to NLP from 2019 to 2020.</p><p>Paired with data integration and BI, it&#8217;s clear that healthcare systems are getting more serious about the value of unlocking their data, both structured and unstructured. NLP, BI, and data integration solve some of the biggest problems the healthcare industry faces, from serving as connective tissue between siloed data sources (in electronic health records, free text, imaging, and more) to safeguarding personally identifiable information (PII) and making sure it stays private. For highly regulated industries, such as healthcare and pharma, these AI-powered technologies will be critical to operations and safety.&nbsp;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.talby.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h2><strong>Evaluating AI Systems: Accuracy, No Data Sharing, and Model Tuning</strong></h2><p>Another encouraging finding concerns the criteria most important to healthcare users when evaluating which AI technologies to explore further. The top three criteria for technical leaders when evaluating such technologies and tools were providing extreme accuracy (48%), ensuring no data is shared with their software providers and vendors whatsoever (44%), and having the ability to train and tune the models to match their own datasets and use cases. 
Privacy, trainability, and accuracy are essential for any AI solution, especially when dealing with medical information that can impact care delivery. Access to data and ownership of specialized models are also primary sources of intellectual property that AI organizations build.</p><p>Accuracy, in particular, is a big topic of interest in clinical applications. Here&#8217;s an example of why this is so important: according to a paper in the&nbsp;<em><a href="https://pubmed.ncbi.nlm.nih.gov/25527336/">Journal of General Internal Medicine</a></em>, "Collection of data on race, ethnicity, and language preference is required as part of the 'meaningful use' of electronic health records (EHRs). These data serve as a foundation for interventions to reduce health disparities." The paper found important discrepancies between what was recorded in EHRs and what patients reported. For example, "30% of whites self-reported identification with at least one other racial or ethnic group than was reflected in the EHR, as did 37% of Hispanics, and 41% of African Americans." This is a problem when you consider that patients from certain backgrounds and ethnicities may have a greater risk of developing certain comorbidities or may lack access to appropriate care. This isn&#8217;t necessarily an AI problem but a data problem, and data needs to be accurate for AI to work its magic.</p><p></p><h2><strong>Evaluating Vendors: Healthcare-Specific &amp; Production-Ready AI</strong></h2><p>This emphasis on accuracy also feeds into what technology leaders seek when evaluating software libraries or SaaS solutions to fuel their AI initiatives. Per the 2021 Healthcare AI Survey, healthcare-specific models and algorithms (42%) and a production-ready codebase (40%) topped the list when considering a solution. Healthcare-specific models are built to handle the nuances of medical data, from clinical jargon and language to billing codes and data from non-text sources, such as X-rays. 
Additionally, production-grade products empower users, from data scientists to clinicians, to integrate AI technologies into their daily workflows with a reduced risk of problems or inaccuracies. After all, they&#8217;ve already been tested and proven and are being updated over time.&nbsp;</p><p>As AI begins to reach patients directly through chatbots, automated appointment scheduling, and access to their medical records, it&#8217;s essential to be aware of both the value and the challenges this technology can bring. A chatbot that cannot connect a person to the correct department may not seem like a big deal, unless the patient is experiencing an acute medical event that needs immediate care. The varying levels of severity in medical settings make it obvious why factors like accuracy, healthcare-specific models, and production-ready codebases could be the difference not just between a successful AI deployment and a failed one but, in some cases, between life and death.</p><p>With the global AI in healthcare market size expected to grow from just under<a href="https://www.marketsandmarkets.com/Market-Reports/artificial-intelligence-healthcare-market-54679303.html?gclid=CjwKCAiAp4KCBhB6EiwAxRxbpFBz30E2wHi3KJCeyDhv3d1fyVZB606t-na38LvMFdeScz8dACIfeBoCK44QAvD_BwE">&nbsp;$5 billion in 2020 to $45.2 billion by 2026</a>, the investments and recent use cases for this technology are proof that AI is here to stay. But with many of these cutting-edge technologies still in their infancy and many challenges ahead, the jury is still out on what the next few years hold for AI adoption, key players, and clinical advances for the healthcare industry. 
Thankfully, with more research at our fingertips, we&#8217;re a bit closer to getting there.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Setting a Higher Standard for State-Of-The-Art Applied AI]]></title><description><![CDATA[We'll explore why "State-of-the-art" in academia differs from industry standards, requiring reproducibility, deployment, and open code. Transparency is crucial for trustworthy AI.]]></description><link>https://www.talby.com/p/setting-a-higher-standard-for-state</link><guid isPermaLink="false">https://www.talby.com/p/setting-a-higher-standard-for-state</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Sun, 07 Feb 2021 06:13:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ffXN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ffXN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!ffXN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 424w, https://substackcdn.com/image/fetch/$s_!ffXN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 848w, https://substackcdn.com/image/fetch/$s_!ffXN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 1272w, https://substackcdn.com/image/fetch/$s_!ffXN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ffXN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif" width="626" height="626" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:626,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:22270,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ffXN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 424w, https://substackcdn.com/image/fetch/$s_!ffXN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 848w, https://substackcdn.com/image/fetch/$s_!ffXN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 1272w, https://substackcdn.com/image/fetch/$s_!ffXN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5cf7cffc-5c3e-4b2a-9edf-0509c3b6ef57.avif 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Over the past four years, I&#8217;ve participated in the technical due diligence of nearly twenty companies claiming some proprietary artificial intelligence (AI) &#8220;secret sauce.&#8221; After evaluating them, the results were split evenly between those showing smoke and mirrors, those on their way there in a year or two, and those with a sound application of machine learning.</p><p>Every company affirmed that its artificial intelligence (AI) was next-generation, world-class, cutting-edge, breakthrough, enterprise-grade, market-leading, or some other&nbsp;<a href="https://www.davidmeermanscott.com/blog/2006/10/the_gobbledygoo.html">gobbledygook</a>. Given the uniform distribution of actual AI capabilities, it&#8217;s no wonder that the average technology buyer has little trust in such assertions. My dataset, although&nbsp;small, suggests they should.</p><p>The term &#8220;state-of-the-art,&#8221; on the other hand, has real, concrete meaning in academia: The best documented, peer-reviewed results obtained on a problem for a reproducible benchmark. It&#8217;s not a claim you can make without proof &#8212; or sustain over time without continuing to innovate. In AI, the website&nbsp;<em><a href="https://paperswithcode.com/">Papers with Code</a></em>&nbsp;curates more than 3,600 state-of-the-art benchmarks and almost 40,000 papers that are ranked for the results they produce, covering computer vision, language, speech, music, games, robotics, and more.</p><p>However, it has become clear that the bar for state-of-the-art applied AI used in real industry systems must be higher.
After all, most academic papers will never see the light of day in terms of actual industrial applications. Delivering real-world AI systems requires more than beating an academic benchmark in a controlled experimental setting.&nbsp;</p><p>To help organizations with this, it&#8217;s important to understand what makes state-of-the-art applied AI and the criteria used to define it. Here, we&#8217;ll explore the three criteria technology leaders should consider before selecting the best solution for their business needs or building their own solution that lives up to its state-of-the-art promises.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.talby.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h2><strong>It&#8217;s peer-reviewed and reproducible</strong></h2><p>The first criterion requires your state-of-the-art software to deliver the best accuracy on benchmarks that are public, reproducible, and trainable. Benchmarks should be designed by a third party, not by the vendor themselves or an affiliated team. Each benchmark must have a public baseline that keeps rising as multiple teams compete to improve it. For example, the&nbsp;<a href="http://nlpprogress.com/">NLP-progress</a>&nbsp;website tracks such benchmarks in natural language processing.</p><p>Second, the solution should be reproducible, meaning anyone outside the provider&#8217;s team should be able to reproduce the same results from scratch. This should include the choice of accuracy metric, hyperparameters, train/test split, software version or hardware used, and so on.</p><p>Lastly, trainability is an important factor. It should be possible to reproduce both the model training and the inference stages. In practice, a top-ranked solution may not match your use case.
In a healthcare setting, for example, you may care about identifying cardiology-specific terms, which no current benchmark specializes in. It&#8217;s also likely that new papers will outperform the current state-of-the-art results within a few months, so keep that in mind when evaluating a solution.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.linkedin.com/groups/7010492/&quot;,&quot;text&quot;:&quot;Join The LinkedIn Group&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.linkedin.com/groups/7010492/"><span>Join The LinkedIn Group</span></a></p><p></p><h2><strong>It&#8217;s in production at multiple companies</strong></h2><p>You cannot claim an AI system to be &#8220;applied state-of-the-art&#8221; if it isn&#8217;t &#8220;applied&#8221; in multiple, real production systems. Real-world data is different from academic data &#8212; it&#8217;s more diverse, noisy, dynamic, and biased. The model that performs best academically is not always the best performer in practice. This is why the industry needs data scientists, tools, and processes that train custom models. While academic benchmarks are useful, they have limitations.</p><p>Additionally, production readiness has its own set of requirements. In this case, multiple independent teams will have evaluated the solution&#8217;s code quality, error handling, logging, monitoring, scalability, security, privacy, deployment, upgrade process, compute, and memory use &#8212; plus aspects of bias, explainability, and concept drift.</p><p>Having multiple deployments in multiple organizations also validates that you haven&#8217;t built a one-off custom solution. There&#8217;s nothing wrong with that, but generalizing one custom solution to a reusable software package requires a different level of expertise. 
Having models that generalize is required for claiming state-of-the-art applied AI.</p><p></p><h2><strong>It&#8217;s Open</strong></h2><p>The third criterion for state-of-the-art applied AI is that a material portion of the codebase should be open. It doesn&#8217;t have to be freely available under a permissive license, but others should be able to inspect it independently. This is important because it shows that you, or the solution you&#8217;re evaluating, actually built it. Many have claimed deep AI expertise while their code merely calls an existing cloud API or pre-trained model. But it&#8217;s misleading to allege you&#8217;re a computer vision expert because you can search TensorFlow Hub. There is nothing wrong with providing an easy-to-use solution for end users &#8212; just be forthcoming about it.</p><p>Providing an open-source or open-core solution also validates whether other people are independently choosing to use it. Claiming that your solution is useful or easy to use is one thing, but getting others to stake their projects on it is another. This requires you to provide the right documentation, integration, examples, and community support.</p><p>Another advantage of making source code open is enabling others to evaluate the code and model quality. Public source code encourages a higher standard of software engineering &#8212; from unit tests and minimal dependencies to machine learning aspects of trainable, robust, and explainable models.</p><p>This level of transparency and third-party evaluation will uncover that your software is far from perfect: It only plays nice as part of certain architectures, requires tradeoffs between accuracy and speed, reuses other software packages, only scales well to a certain point, isn&#8217;t cost-effective at all scale levels, and has a few experimental features. This is all fine and expected &#8212; all software is like that.
Real state-of-the-art solutions are transparent about this.</p><p></p><h2><strong>If the AI industry wants to shake off buyers&#8217; perception of being sold snake oil, it should stop selling it</strong></h2><p>For users, it&#8217;s essential to be aware of what makes a solution truly state-of-the-art and which ones are just sprinkling AI on as an afterthought. Let&#8217;s set a high bar for what great applied AI means and take the long path to achieve it.&nbsp;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[How Healthcare NLP Evolved from 2020 to 2021]]></title><description><![CDATA[Healthcare embraces natural language processing (NLP) to improve patient care.
With growing NLP budgets, accuracy is a top priority.]]></description><link>https://www.talby.com/p/how-healthcare-nlp-evolved-from-2020</link><guid isPermaLink="false">https://www.talby.com/p/how-healthcare-nlp-evolved-from-2020</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Mon, 11 Jan 2021 06:57:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!GcOD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GcOD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GcOD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 424w, https://substackcdn.com/image/fetch/$s_!GcOD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 848w, https://substackcdn.com/image/fetch/$s_!GcOD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 1272w, https://substackcdn.com/image/fetch/$s_!GcOD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!GcOD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif" width="1456" height="727" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c29755df-72ff-4edf-96c6-90e4814aed15.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:727,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:98634,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GcOD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 424w, https://substackcdn.com/image/fetch/$s_!GcOD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 848w, https://substackcdn.com/image/fetch/$s_!GcOD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 1272w, https://substackcdn.com/image/fetch/$s_!GcOD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc29755df-72ff-4edf-96c6-90e4814aed15.avif 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p></p><p>Few industries have embraced natural language processing (NLP) as openly as healthcare. With the ability to identify new variants of COVID-19 and help speed up clinical trials for the vaccine, the pandemic is just one example of what NLP is capable of achieving.
And while new research points to NLP budgets growing significantly across vertical industries, locations, company sizes, and maturity levels, healthcare is leading the pack.</p><p>Big strides have been made in AI and NLP over the last year, but despite progress and increased investments, many of the challenges and barriers to entry remain the same.&nbsp;</p><p>The second annual <a href="https://gradientflow.com/2021nlpsurvey/?utm_source=jsl&amp;utm_medium=prrelease">NLP Industry Survey </a>explores the triumphs, challenges, applications, and tools shaping NLP adoption.&nbsp;</p><p>The largest industry representation (17%) in this survey came from healthcare respondents, even greater than those in technology fields, which is reflective of overall industry adoption. As such, by analyzing how NLP has evolved over the last year in the healthcare space, we can get a glimpse of what&#8217;s on the horizon for the technology.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.talby.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.talby.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h2><strong>NLP Budgets Keep Growing</strong></h2><p>While 60% of tech leaders indicated their NLP budgets grew by at least 10%, a majority of healthcare technologists are spending 10-30% more on NLP compared to last year. It&#8217;s encouraging to see that even in the wake of the pandemic, IT investments in areas like NLP were still strong. 
It&#8217;s even possible that the circumstances of last year proved how valuable the technology can be.&nbsp;</p><p>For example, NLP algorithms are now able to generate protein sequences and predict virus mutations, including key changes that help the coronavirus evade the immune system, according to <a href="https://www.technologyreview.com/2021/01/14/1016162/ai-language-nlp-coronavirus-hiv-flu-mutations-antinbodies-immune-vaccines/">MIT research</a>. <a href="https://f.hubspotusercontent20.net/hubfs/1794529/AI%20Case%20Studies/John%20Snow%20Labs%20-%20Kaiser%20Permenente%20Case%20Study.pdf?utm_campaign=2020%20Survey%20Downloads&amp;utm_medium=email&amp;_hsmi=131196563&amp;_hsenc=p2ANqtz--YF7HdJi5f___1Mhr63_Y-jS1AM7EiyFtSII3uMp4LqIs4eLqPZymi51B12CqVm5J62i9TE2pNtuqCUzLbq36Erc56qg&amp;utm_content=131196563&amp;utm_source=hs_email">Kaiser Permanente</a> uses NLP for extracting key features from EHR notes to optimize hospital patient flow &#8212; something critical to operations when healthcare organizations are overwhelmed with an influx of patients with differing levels of severity. These are just a few examples of what investments in NLP can achieve.&nbsp;</p><p></p><h2><strong>NLP Use Cases are Expanding</strong></h2><p>Aligned with respondents from other industries, healthcare tech leaders cited named entity recognition (NER) and document classification as the primary use cases for NLP. Looking ahead, we can expect growth in Q&amp;A and natural language generation use cases powered by large language prediction models and related open-source alternatives. This will bring a greater level of humanity to NLP, as users will be able to speak in plain language directly to the technology and get a prompt, contextually relevant response.&nbsp;</p><p>De-identification is another use case that&#8217;s popular among highly regulated industries, such as healthcare.
This enables users to redact personally identifiable information &#8212; names, addresses, social security numbers &#8212; subject to regulations like HIPAA and GDPR. De-identification will likely gain steam as a use case for other industries as businesses develop better data privacy practices. De-identification can also remove certain types of spurious correlations or biases from models, so it will likely become more commonplace as Responsible AI practices become mainstream.</p><p></p><h2><strong>NLP Challenges: Accuracy Above All</strong></h2><p>When dealing with patients and their care, it&#8217;s clear why accuracy is the top priority users consider when evaluating an NLP solution. That said, it&#8217;s also one of the biggest challenges users face &#8212; 44% of them to be exact. Accuracy refers to the effectiveness of pre-trained models that come with NLP libraries, and it&#8217;s critical as results from previous tasks and models are used downstream.&nbsp;</p><p>Not only is getting it right from the get-go paramount, but being able to tune models over time is equally important in order to prevent degradation and understand domain-specific jargon. As healthcare is an industry with unique challenges and nuances, this often requires a data scientist as well as a domain expert for optimal results. Because of the changing nature of data, regulations, and discoveries, even as NLP technology matures, accuracy will likely remain a challenge in years to come.&nbsp;</p><p></p><h2><strong>NLP Tools: Libraries and Cloud Use&nbsp;&nbsp;</strong></h2><p>Among the NLP libraries in use, Spark NLP remains the most popular. It is used by nearly a third (31%) of general respondents and 59% of healthcare respondents. Additionally, the use of NLP cloud services is rising steadily, with a 23% increase for the Top 4 cloud providers &#8212; AWS, Azure, Google, and IBM &#8212; since 2020.
Even so, there are serious concerns among survey respondents about the pricing models for these cloud services as NLP practices scale. For solutions that need to process many documents on a regular basis, these cloud services are perceived as prohibitively expensive.</p><p>While progress has endured the global pandemic, a worldwide shortage of AI talent, and ongoing concerns about data sharing and privacy, <a href="https://hitconsultant.net/tag/natrual-language-processing/">NLP</a> has proven it&#8217;s here to stay. Although it&#8217;s likely that challenges such as accuracy, scalability, and cost will persist into the future, new exciting use cases and advances in the technology will be interesting to watch, with the healthcare industry forging the path forward.&nbsp;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&quot;,&quot;text&quot;:&quot;Share AI in Healthcare&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://aiinhealthcare.substack.com/?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share"><span>Share AI in Healthcare</span></a></p><p></p>]]></content:encoded></item><item><title><![CDATA[Why Your AI Projects Won’t Scale – and it’s Not the Technical Talent Gap]]></title><description><![CDATA[AI is a relatively new technology and proper training and education takes time, so be sure that you have a strategy in place to onboard your entire team to unlock the benefits of AI.]]></description><link>https://www.talby.com/p/why-your-ai-projects-wont-scale-and</link><guid isPermaLink="false">https://www.talby.com/p/why-your-ai-projects-wont-scale-and</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 23 Dec 2020 16:18:00 GMT</pubDate><enclosure
url="https://substackcdn.com/image/fetch/$s_!MGEN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MGEN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MGEN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MGEN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MGEN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MGEN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MGEN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg" width="1456" height="823" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:261279,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MGEN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 424w, https://substackcdn.com/image/fetch/$s_!MGEN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 848w, https://substackcdn.com/image/fetch/$s_!MGEN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!MGEN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50bacfbb-de4c-434e-9a13-c0e7a10b6903_2300x1300.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p></p><p>For the last few years, companies have been increasingly experimenting with <a href="https://salestechstar.com/category/predictive-ai-artificial-intelligence/">artificial intelligence</a> (AI), advancing their computing and data analysis capabilities, investing in new technologies, and trying to reap the benefits of all that AI has to offer. We can expect interest and adoption of AI to become even more pervasive in the coming years, as the barriers to entry lower and more AI talent emerges.
While many tools are at the disposal of enterprise organizations today, it&#8217;s important for businesses to accelerate efforts across their products and processes to ensure they don&#8217;t lose a competitive edge.</p><p>It sounds simple enough&#8212;invest in the right technology and talent and let AI do the work. But having the right infrastructure and people in place is only half the battle. In fact, despite increased interest in, and adoption of, AI in the enterprise, 85 percent of AI projects fail to deliver on their intended promises, according to research from <a href="https://www.techrepublic.com/article/why-85-of-ai-projects-fail/">Pactera Technologies</a>. So, while AI is poised to become a normal part of all business operations, many businesses still struggle to implement and scale their AI projects.</p><p>The real reason AI strategies fail to work or scale as expected comes down to talent&#8212;but it&#8217;s not just data science talent, as many may think. While the AI skills gap, or the lack of resources to fund highly priced AI talent, is often at the center of the conversation, it&#8217;s more often than not gaps in product, design, and business talent that keep AI projects from succeeding. As important as technical talent is, understanding how AI will work within a product and how it translates to better <a href="https://salestechstar.com/?s=customer+experience">customer experience</a> and new revenue is just as critical.</p><p>In healthcare, for example, we have algorithms that can read an X-ray as accurately as a human can&#8212;but integrating them into the clinical workflow is a real challenge. 
Being able to train and deploy accurate AI models doesn&#8217;t address the question of how to most effectively use them to help your customers. Doing this requires educating all organizational disciplines&#8212;sales, marketing, product, design, legal, customer success&#8212;on why this is useful and how it will impact their job function.</p><p>When done well, new capabilities unlocked by AI enable product teams to completely rethink the user experience. It&#8217;s the difference between adding Netflix or Spotify recommendations as a side feature, versus designing the user interface around content discovery. Or, more aspirationally, the difference between adding a lane departure alert to your new car versus building a self-driving vehicle that doesn&#8217;t have pedals or a steering wheel.</p><p>A real instance of the challenges organizations face when implementing and scaling AI projects comes from a recent <a href="https://dl.acm.org/doi/pdf/10.1145/3313831.3376718">Google Research</a> paper about a new deep-learning model used to detect diabetic retinopathy from images of patients&#8217; eyes. Diabetic retinopathy can cause blindness when left untreated, but vision loss can often be prevented if the condition is detected early. In response, scientists trained a deep learning model to identify early stages of diabetic retinopathy in patients from retinal images taken during eye exams over the past 2-3 years.</p><p>While in theory the trained model was at least as accurate as human specialists, this wasn&#8217;t the case when it was applied in clinics in rural Thailand. There, the imaging machines were not as advanced as those Google had access to for model training. In some cases, the clinics had no room that could be made completely dark for the exams, as the trained model assumed. 
Some patients refused to take the test because of trust issues&#8212;the nurses weren&#8217;t trained to explain why this new test was necessary, or patients were scared that a bad result would require them to spend another day going to a hospital for follow-up treatment. The lack of infrastructure, of cohesive education for employees, and of understanding of practical limitations is a clear example of the major gap between data science success and business success.</p><p>Successful AI products and services require applied skill in three layers. First, data scientists must be available, productively tooled, and have domain expertise and access to relevant data. While the techniques for bias prevention, explainability, concept drift, and similar issues are becoming well understood, many teams are still struggling with this first layer of technical issues. Second, organizations must learn how to deploy and operate AI models in production. This requires DevOps, SecOps, and newly emerging &#8220;<a href="https://salestechstar.com/?s=AI+Ops">AI Ops</a>&#8221; tools and processes to be put in place so models continue working accurately in production over time. Third, product managers and business leaders must be involved from the get-go, in order to design how new technical capabilities will be applied to make customers and end users successful.</p><p>While there&#8217;s been tremendous progress over the past five years on the first layer (education and tooling for data scientists), we&#8217;re very early on in tackling the second layer (operating AI models in production). The third layer (design and product management) is far behind and is becoming the most common barrier to AI success. Fortunately, these problems can be addressed and corrected with a few easy steps.</p><p>Closing the business-wide AI talent gap comes down to investing in hands-on education. 
Outside of classrooms and conference halls, professionals from all across an organization must get experience actually working on AI projects, understanding what AI can do and how the technology can push a business forward. AI is a relatively new technology, and proper training and education take time, so be patient, and be sure that you have a strategy in place to onboard your entire team to unlock the benefits of AI for your customers and business.</p>]]></content:encoded></item><item><title><![CDATA[Three Insights From Google's 'Failed' Field Test To Use AI For Medical Diagnosis]]></title><description><![CDATA[Google Research field test reveals challenges in deploying medical AI: distinguishing research from engineering, the need for a holistic approach, and generalization issues in healthcare AI models.]]></description><link>https://www.talby.com/p/three-insights-from-googles-failed</link><guid isPermaLink="false">https://www.talby.com/p/three-insights-from-googles-failed</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Tue, 09 Jun 2020 05:05:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zzvT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" 
target="_blank" href="https://substackcdn.com/image/fetch/$s_!zzvT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zzvT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 424w, https://substackcdn.com/image/fetch/$s_!zzvT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 848w, https://substackcdn.com/image/fetch/$s_!zzvT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 1272w, https://substackcdn.com/image/fetch/$s_!zzvT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zzvT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif" width="626" height="469" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/678543b8-7115-4837-a4d6-691eb88351ce.avif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:469,&quot;width&quot;:626,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:11488,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/avif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zzvT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 424w, https://substackcdn.com/image/fetch/$s_!zzvT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 848w, https://substackcdn.com/image/fetch/$s_!zzvT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 1272w, https://substackcdn.com/image/fetch/$s_!zzvT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F678543b8-7115-4837-a4d6-691eb88351ce.avif 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>Last month, a team from Google Research published a paper on the results of a field test of a novel deep-learning model to detect diabetic retinopathy from images of patients' eyes. 
The&nbsp;<a href="https://dl.acm.org/doi/pdf/10.1145/3313831.3376718">paper</a>, titled "A Human-Centered Evaluation of a Deep Learning System Deployed in Clinics for the Detection of Diabetic Retinopathy," describes field research conducted in partnership with the Ministry of Public Health in Thailand in 11 rural clinics across the provinces of Pathum Thani and Chiang Mai.</p><p>TechCrunch wasted no time in&nbsp;<a href="https://techcrunch.com/2020/04/27/google-medical-researchers-humbled-when-ai-screening-tool-falls-short-in-real-life-testing/">summarizing</a>&nbsp;the study: "Google medical researchers humbled when AI screening tool falls short in real-life testing."</p><p>The article goes on to summarize the failures of the system in practice &#8212; from the lack of dedicated screening rooms that could be darkened to take high-quality images, to inconsistent broadband connectivity, to patients' concerns about having to follow up at a hospital. But I believe this coverage misses the mark in three important aspects, which should be of prime concern to people actually working to deploy medical AI in the field.</p><p></p><h2><strong>Research Is Not Engineering</strong></h2><p>First, there is a difference between research and engineering, and research studies like this one should be heralded for the progress they enable. According to&nbsp;<a href="https://twitter.com/GoogleHealth/status/1254098126660603904">Google Health</a>, "This is one of the first published studies examining how a deep learning system is used in patient care." 
We need more studies about medical AI deployments published at the Conference on Human Factors in Computing Systems &#8212; and these studies need to describe things the way they are. Unlike a startup going to market that must spin whatever happens as a success story, research work is only about uncovering the truth.</p><p>Implying such studies are failures not only misrepresents their goal and achievement but also contributes to the issues of nonreproducible research and "science by press release" that plague today's science. If you're one of the many people trying to apply deep learning for medical imaging in practice, then you'll find this paper a gem.</p><p></p><h2><strong>AI Success = Science + Engineering + Process Change</strong></h2><p>Second, there must be an understanding of what it takes to get an AI system from idea to production. Assuming that a basic scientific breakthrough makes a system ready for wide use would have caused the invention of the steam engine to receive press coverage like this: "Scientists humbled to find we're nowhere near a robust national railway system." This is how&nbsp;<a href="https://pilcrow.squarespace.com/stories/car-hatred">cars were originally covered in the media</a>, so there's nothing new under the sun in seeing it happen again with AI.</p><p>Extending the analogy to cars, here are the three workstreams that must come together for medical AI systems to become an effective everyday reality:</p><p><strong>1. Science:&nbsp;</strong>We need to develop highly accurate data science algorithms for specific problems, as Google did with its original&nbsp;<a href="https://ai.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html">deep learning models for detecting diabetic retinopathy</a>. In the analogy to cars, this would be like the invention of the internal combustion engine.</p><p><strong>2. 
Engineering:</strong>&nbsp;We need to develop ways to productize these inventions safely and cheaply, at high quality and high scale. In the analogy to cars, we need to invent the equivalents of the mass production line, hand brakes, electric starters, air conditioners, airbags and headrests. In the AI space, think MLOps, explainability, bias detection and model governance (as a start). This is the area of the ecosystem where I personally work and specialize.</p><p><strong>3. Process change:</strong>&nbsp;We need to develop human-centered processes that enable people to use these innovations effectively and safely. In the analogy to cars, think of splitting the public space between roads and sidewalks, establishing driver licensing, public education, safety standards, and pollution regulation. In medical AI, we've barely started on this, which makes the recent Google field study an important baby step.</p><p>It's important for practitioners to know that real success &#8212; helping real patients, in the field, at scale, and safely &#8212; requires all three of these aspects to work together. It's equally important for media coverage to educate people about this.</p><p></p><h2><strong>Once More With Feeling: Health Care AI Models Do Not Generalize</strong></h2><p>The third insight from this new study comes from the variation among the 11 clinics that took part in it. The researchers reported major differences between them &#8212; from how the physical rooms at each clinic were laid out to the personalities and backgrounds of the nurses who worked there. 
As a result, the trained model could not successfully operate in each of these distinct environments.</p><p>This is such a well-known phenomenon in medical AI that it no longer requires academic validation. Medical AI models generally&nbsp;<a href="https://www.forbes.com/sites/forbestechcouncil/2019/04/03/why-machine-learning-models-crash-and-burn-in-production/#f261972f4379">perform poorly</a>&nbsp;across locations. This not only applies to models deployed in Thailand versus Nigeria but also to models deployed in two clinics that are 5 kilometers apart and serve essentially the same population. This happens in both first-world and third-world countries and across just about every medical specialty that's taken the time to&nbsp;<a href="https://scholar.google.com/scholar?q=concept+drift+in+healthcare+ai&amp;hl=en&amp;as_sdt=0&amp;as_vis=1&amp;oi=scholart">measure it</a>.</p><p>As a result: If you have a successfully deployed model in one location (or 10), you do not have an accurate model that's ready for the next clinic. Continuously tuning and monitoring AI models is part of the engineering work underway in the "Science + Engineering + Process Change" trifecta. At this point in time, I expect every sound medical AI field deployment to be addressing this issue.</p><p>Turning medical AI from aspiration into a reality that improves humanity's well-being is going to be a long ride. It will take us all of the first half of the 21st century &#8212; and that's if we're efficient about it. 
Maybe this isn't original, but it may be the adventure of a generation.</p>]]></content:encoded></item><item><title><![CDATA[Healthcare AI Does Not Need A New Hippocratic Oath]]></title><description><![CDATA[This article explores the historical context of ethics in healthcare technology, highlighting the relevance of an ancient oath and the need for a strong ethical framework in AI.]]></description><link>https://www.talby.com/p/healthcare-ai-does-not-need-a-new</link><guid isPermaLink="false">https://www.talby.com/p/healthcare-ai-does-not-need-a-new</guid><dc:creator><![CDATA[David Talby]]></dc:creator><pubDate>Wed, 20 May 2020 05:01:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Rwz8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Rwz8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Rwz8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Rwz8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Rwz8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Rwz8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Rwz8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg" width="728" height="1526.98" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;normal&quot;,&quot;height&quot;:1678,&quot;width&quot;:800,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:1043006,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Rwz8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Rwz8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Rwz8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Rwz8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1349d55-0cda-4890-89d9-7cce61157364_800x1678.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a><figcaption class="image-caption">Image credit: A fragment of the oath on the 3rd-century <a href="https://en.wikipedia.org/wiki/Oxyrhynchus_Papyri">Papyrus Oxyrhynchus</a> 2547. https://en.wikipedia.org/wiki/Hippocratic_Oath</figcaption></figure></div><p>Ethics in AI is a major issue, especially in response to the role that AI technology has played in&nbsp;<a href="https://www.theguardian.com/society/2019/may/10/online-hate-against-disabled-people-rises-by-a-third">growing hate crime</a>,&nbsp;<a href="https://www.fast.ai/2017/11/02/ethics/">destabilizing democracy</a>,&nbsp;<a href="https://www.nytimes.com/interactive/2019/06/08/technology/youtube-radical.html">radicalizing youth</a>&nbsp;and&nbsp;<a href="https://www.newscientist.com/article/2166207-discriminating-algorithms-5-times-ai-showed-prejudice/">scaling up discrimination</a>&nbsp;-- all while profiteering from it.</p><p>This is of utmost concern when life-and-death decisions are automated -- as in healthcare. It has led to broad discussions about the need for&nbsp;<a href="https://www.thersa.org/discover/publications-and-articles/rsa-blogs/2017/02/a-hippocratic-oath-for-ai-developers-it-may-only-be-a-matter-of-time">data scientists to take a form of the Hippocratic oath</a>&nbsp;-- or even that&nbsp;<a href="https://techcrunch.com/2018/03/14/a-hippocratic-oath-for-artificial-intelligence-practitioners/">AI solutions should take it</a>&nbsp;in some form.</p><p>However, if you do work in healthcare, you'll know that ethics in healthcare technology is much further along than in software technology. 
I challenge you to live by an older oath, which also has the benefit of putting the&nbsp;<a href="https://healthmanagement.org/c/healthmanagement/IssueArticle/ai-is-the-new-reality-the-4th-healthcare-revolution-in-medicine">AI revolution</a>&nbsp;in a much humbler historical context.</p><p></p><h2><strong>Healthcare Technology: Tale as Old as Time</strong></h2><p>Pharmacy --&nbsp;<a href="http://pharmacy.wsu.edu/documents/2018/01/history-of-the-pharmacy-profession.pdf">practiced since the beginning of civilization</a>&nbsp;and documented since the beginning of writing -- is the first form of healthcare technology. Instead of directly seeing patients, a pharmacist would concoct potions, pills, creams, and devices that patients or physicians would buy. They claimed to relieve pain, get you back on your feet or make you younger, prettier, happier, or sexier.</p><p>The top sellers never change.</p><p>These technologists would concoct whatever remedy they could at home and then bring it to market. There was usually no proof that these products worked at all -- and every financial incentive to lie, exaggerate and distract people from facts. Great marketing -- showmanship, packaging, naming, distribution -- was a blueprint for commercial success throughout history.</p><p>It didn&#8217;t help that until the development and wide adoption of statistics, there was no way to test what really worked. New technology -- whether it was a new medicinal herb or surgical technique -- was always unproven for a long while and always attracted many more charlatans than true believers. 
We&#8217;re at that point today with digital health.</p><p>These ancestral healthcare technologists did not explain how their products worked because that secret was their livelihood. Often they didn&#8217;t know themselves. Explainability is not an issue that started with deep learning -- it started before the alphabet.</p><p>There was never a time in history when ethics in healthcare technology was not an ongoing public concern. Bad actors kept selling, and good people kept buying -- because of the placebo effect, because there&#8217;s a new sucker born every minute or because the sick just wanted a glimmer of hope.</p><p></p><h2><strong>Pharmacy And The Oath Of Maimonides</strong></h2><p>Pharmacists -- those in charge of the safe application of healthcare technology, from&nbsp;<a href="https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3358962/">medicinal herbs in Sumer</a>&nbsp;5000 years ago to&nbsp;<a href="https://www.mobihealthnews.com/news/prescribing-right-apps-right-patients">prescribing the right app to the right patient</a>&nbsp;today -- already have a strong ethical framework in place.</p><p>The most widely used oath is the&nbsp;<a href="https://dal.ca.libguides.com/c.php?g=256990&amp;p=1717827">Oath of Maimonides</a>. Attributed to its namesake, the 12th-century rabbi, physician, and philosopher, and published in Germany in 1783, it is surprisingly appropriate for healthcare AI practitioners today. 
For those unfamiliar with the oath, here is a snippet from the full text:</p><p>"May the love for my art actuate me at all time; may neither avarice nor miserliness nor thirst for glory or for a great reputation engage my mind; for the enemies of truth and philanthropy could easily deceive me and make me forgetful of my lofty aim of doing good to thy children."</p><p>"May I never see in the patient anything but a fellow creature in pain."</p><p>"Grant me the strength, time, and opportunity always to correct what I have acquired, always to extend its domain; for knowledge is immense and the spirit of man can extend indefinitely to enrich itself daily with new requirements."</p><p>"Today he can discover his errors of yesterday and tomorrow he can obtain a new light on what he thinks himself sure of today."</p><p></p><h2><strong>The Way Things Are</strong></h2><p>As an industry, we are far away from living by such an oath. 
Consider the distance between never seeing anything but a fellow creature in pain and these statements:</p><ul><li><p>In a fee-for-service setting, a patient walking into the hospital is a revenue stream.</p></li><li><p>In an accountable care (<a href="https://khn.org/news/aco-accountable-care-organization-faq/">ACO</a>) setting, the same patient walking in is a cost center.</p></li><li><p>Payers are the true customers of a healthcare provider -- they are the ones who pay.</p></li><li><p>The terms "patient" and "healthcare consumer" can be used&nbsp;<a href="https://www.beckershospitalreview.com/healthcare-information-technology/consumers-vs-patients-healthcare-s-biggest-misunderstanding.html">interchangeably</a>.</p></li><li><p>The quadruple aim of healthcare balances trade-offs between improving the patient experience, improving population health, reducing the cost of care, and improving clinical staff satisfaction.</p></li></ul><p>This is, unfortunately, how U.S. healthcare speaks in the early 21st century. It&#8217;s so pervasive that it was impossible for me to pick citation links here -- just search for top-of-mind statements from executives and leaders of provider groups, payers, startups, health IT companies, pharma companies, and everyone in between. It&#8217;s a business, first and foremost. Love of the art is second to making money or fame, and for every&nbsp;<a href="https://www.investopedia.com/articles/investing/020116/theranos-fallen-unicorn.asp">Theranos</a>&nbsp;that gets caught, many more keep going.</p><p>There is nothing new under the sun.</p><p>The good news is that as a data scientist, you do not need to wait for an ethical framework or for widely accepted solutions to AI bias, explainability, or plain misuse. Simply build to help patients and humbly acknowledge the limits of what you build. Keep learning. Join a team that does not require you to make unethical trade-offs.</p><p>Not there yet? 
Today you can discover the errors of yesterday, and tomorrow you can obtain a new light.</p>]]></content:encoded></item></channel></rss>