🛠️ Selling probabilistic programming to implement entrepreneurship #224
Replies: 18 comments 15 replies
1. Inference controller, path-finding sequential Monte Carlo (SMC) algorithms
- Fact:
- Angie's belief:
- Angie's action: sharing a modularized prior and simulated re-action, asking for verification + further action (modularized prior; simulate re-action to prompt)
The baseline controller represents a standard approach to robot navigation, using observed data to infer the robot's position and move toward the goal. In contrast, the robust controller introduces particle filtering, a method that uses a set of "particles" to represent the distribution of possible states, to improve the agent's beliefs about its location, particularly when uncertainties or changes in the environment occur.

The concept of punctuated equilibrium from evolutionary biology (and by extension, evolutionary psychology) can offer intriguing insights into processes of adaptation and innovation, such as those found in startup pivots or the development of robust controllers for localization algorithms. Punctuated equilibrium posits that evolutionary development is characterized by long periods of stability (equilibrium) interrupted by short, sudden changes (punctuations). In the context of startups, this parallels periods of steady growth or consistent business models that are periodically disrupted by significant shifts or 'pivots' in response to market changes, technological advancements, or new information.

Applying this to your friend George's situation with his localization algorithm, the 'Kidnap the robot' animation might metaphorically represent a startup's unexpected shift in the market or a sudden realization that the current business model isn't viable: a 'kidnapping' of the startup's trajectory. George's difficulty with developing an inference controller could be akin to a startup struggling to find a new direction after such a disruption. The process of developing a robust controller could benefit from evolutionary principles in the following ways:

By drawing parallels between these evolutionary mechanisms and the challenges George faces, one can suggest that a more dynamic, adaptive, and resilient approach to algorithm development may emerge. This perspective might not only aid in refining his inference controller but also serve as an inspiration for startups navigating their pivots.
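To make the contrast concrete, here is a minimal sketch of the particle-filter update the robust controller relies on. It is illustrative only: the 1-D state, the motion noise, and the Gaussian observation model are my assumptions, not George's actual algorithm.

```python
import math
import random

def particle_filter_step(particles, control, observation,
                         motion_noise=0.1, obs_noise=0.5):
    """One SMC update: propagate, weight, resample."""
    # 1. Propagate each particle through a noisy motion model.
    moved = [p + control + random.gauss(0.0, motion_noise) for p in particles]
    # 2. Weight particles by the likelihood of the observation.
    weights = [math.exp(-((observation - p) ** 2) / (2 * obs_noise ** 2))
               for p in moved]
    total = sum(weights) or 1e-12
    weights = [w / total for w in weights]
    # 3. Resample: surviving particles concentrate near plausible states,
    #    which is what lets beliefs recover after a "kidnapping".
    return random.choices(moved, weights=weights, k=len(moved))

# Toy run: the true position drifts right; the filter tracks it.
random.seed(0)
particles = [random.uniform(0, 10) for _ in range(500)]
true_pos = 2.0
for _ in range(20):
    true_pos += 0.5
    obs = true_pos + random.gauss(0.0, 0.5)
    particles = particle_filter_step(particles, 0.5, obs)
estimate = sum(particles) / len(particles)
```

The resampling step is the "punctuation": a surprising observation reweights the whole population of hypotheses at once, rather than nudging a single point estimate.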
1. Population model (Jaao, Angie). CrossCat: https://jmlr.org/papers/volume17/11-392/11-392.pdf, http://probcomp.csail.mit.edu/software/crosscat/
To answer Charlie's question in "halfway persuasion on PC(BE).txt", I prepared three summaries of Vikash's talks:
1. TED talk: "AI That Understands the World, Using Probabilistic Programming"
Summary: Probabilistic programs provide a new symbolic language for expressing uncertain knowledge about possible worlds and the processes to infer them. It is a new medium for knowledge representation. Examples demonstrate probabilistic programming outperforming machine learning systems on tasks such as perceiving 3D structure, reducing errors in perception (compared to Tesla's neural networks), cleaning millions of database records from US Medicare, and forecasting econometric time series. These probabilistic programs seem to understand the world more like humans do, in terms of symbolic representations.
Answer to your questions on its novelty + success case: Probabilistic Programming Tutorial, Parts 1 and 2.
2. Part I
3. Part II
🙋♀️ Query
Q1. Among the three query languages …
Q2. What is the family tree of the three languages? If it is BQL to IQL to GenSQL, what "need" for features prompted the birth of the latter ones? How would this be relevant to startup pivoting decisions?
👨🏽🏫 Predicted answer
Below is a short comparison of the three languages from the supply side and the demand side (in historic order).
Supply-side table
Demand-side table
Case study: building with prob.comp (Matin and Mathieu) on a medtech company (a medical device with >10x more accurate imaging) that aims to find a win-win strategy that satisfies the CEO, VCs, and employees.
2. Reverse engineering natural intelligence
Meeting with Zane. Planning a business is hard to quantify (an adjustment from an engineering focus to a business focus).
Presenting (prior beliefs on latent values; ability to input)
3. Posterior space sampling algorithms: different sampling algorithms that have implications for entrepreneurial learning (Claude: optimal learning modes)
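As a toy illustration of how different sampling regimes behave (this is not any specific algorithm from the notes), here is a random-walk Metropolis-Hastings sketch on a bimodal target: small proposal steps "exploit" the current mode, large steps "explore" across modes, which is the entrepreneurial-learning analogy in miniature. All numbers are made up.

```python
import math
import random

def metropolis_hastings(log_p, x0, step, n):
    """Random-walk Metropolis-Hastings; `step` sets the exploration radius."""
    x, samples = x0, []
    for _ in range(n):
        prop = x + random.gauss(0.0, step)
        # Accept with probability min(1, p(prop) / p(x)).
        if math.log(random.random() + 1e-300) < log_p(prop) - log_p(x):
            x = prop
        samples.append(x)
    return samples

# Bimodal target: two "business models" centered at -3 and +3.
log_p = lambda x: math.log(math.exp(-(x + 3) ** 2)
                           + math.exp(-(x - 3) ** 2) + 1e-300)

random.seed(1)
local = metropolis_hastings(log_p, x0=-3.0, step=0.2, n=5000)  # exploit
bold = metropolis_hastings(log_p, x0=-3.0, step=3.0, n=5000)   # explore
# The small-step chain tends to stay in its starting mode; the
# large-step chain crosses between modes and discovers the other one.
frac_right_local = sum(s > 0 for s in local) / len(local)
frac_right_bold = sum(s > 0 for s in bold) / len(bold)
```

The design point: both chains target the same posterior, but the proposal scale trades off local refinement against discovery of distant modes, much like incremental iteration versus a pivot.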
@chasfine asked for definitions of the three APIs/architectures of Vikash (prob.comp), so I guessed below to get @mugamma's verification
tl;dr
1. Modular learning from human-scale data: This refers to a learning approach that breaks down complex problems into smaller, manageable components (modules) and learns from data that is comparable in scale to what humans typically encounter. In the context of probabilistic programming, this involves: a) creating modular probabilistic programs that can be combined and reused across different tasks. This approach contrasts with traditional machine learning methods that often require massive datasets and monolithic models.

2. AI-human comparisons in comparable worlds: This concept involves creating simulated environments or "worlds" where both AI systems and humans can be evaluated on equal footing. In these comparable worlds: a) we define tasks and scenarios that are meaningful for both AI and humans. This approach allows us to make fair comparisons and gain insights into how AI systems can be designed to complement human intelligence.

3. World modeling and decision making: This refers to the process of creating computational models of the world (or specific domains within it) and using these models to make decisions. In the context of probabilistic programming: a) world modeling involves creating probabilistic programs that capture the causal structure, uncertainties, and dynamics of a given environment or problem domain. d) The probabilistic nature of these models allows for reasoning about uncertainty in both the world state and the outcomes of actions. This approach to world modeling and decision making aims to capture the richness and uncertainty of real-world scenarios, allowing for more robust and adaptable AI systems that can reason in ways more similar to human cognition. (cld: Leveraging Probabilistic AI for Entrepreneurial Decision-Making)
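A hedged sketch of what "modular probabilistic programs that can be combined and reused" could look like in practice. The module names (`market_size_prior`, `adoption_rate_prior`, `revenue_model`) and every parameter below are hypothetical, chosen only to illustrate composition:

```python
import math
import random

# Hypothetical reusable "modules": each is a small generative function
# that can be recombined into different models.
def market_size_prior():
    # Log-normal-ish prior over the addressable market, in $M.
    return math.exp(random.gauss(3.0, 1.0))

def adoption_rate_prior():
    # Beta prior favoring low adoption rates.
    return random.betavariate(2, 8)

def revenue_model(market, adoption):
    # Deterministic core with multiplicative observation noise.
    return market * adoption * random.lognormvariate(0.0, 0.1)

def simulate_startup():
    """Compose the modules into one generative model of revenue."""
    m = market_size_prior()
    a = adoption_rate_prior()
    return revenue_model(m, a)

random.seed(2)
draws = [simulate_startup() for _ in range(10000)]
mean_revenue = sum(draws) / len(draws)
```

Because each piece is a standalone generative function, `market_size_prior` could be reused unchanged inside a hiring model or a fundraising model, which is the "combined and reused across different tasks" point in miniature.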
#249
How GenSQL is framed:
GenSQL, a generative AI system for databases, could help users make predictions, detect anomalies, guess missing values, fix errors, or generate synthetic data with just a few keystrokes. For instance, if the system were used to analyze medical data from a patient who has always had high blood pressure, it could catch a blood pressure reading that is low for that particular patient but would otherwise be in the normal range.
GenSQL automatically integrates a tabular dataset and a generative probabilistic AI model, which can account for uncertainty and adjust their decision-making based on new data. Moreover, GenSQL can be used to produce and analyze synthetic data that mimic the real data in a database. This could be especially useful in situations where sensitive data cannot be shared, such as patient health records, or when real data are sparse.
This new tool is built on top of SQL, a programming language for database creation and manipulation that was introduced in the late 1970s and is used by millions of developers worldwide. “Historically, SQL taught the business world what a computer could do. They didn’t have to write custom programs, they just had to ask questions of a database in high-level language. We think that, when we move from just querying data to asking questions of models and data, we are going to need an analogous language that teaches people the coherent questions you can ask a computer that has a probabilistic model of the data,” says Vikash Mansinghka ’05, MEng ’09, PhD ’09, senior author of a paper introducing GenSQL and a principal research scientist and leader of the Probabilistic Computing Project in the MIT Department of Brain and Cognitive Sciences.
When the researchers compared GenSQL to popular, AI-based approaches for data analysis, they found that it was not only faster but also produced more accurate results.
Importantly, the probabilistic models used by GenSQL are explainable, so users can read and edit them. “Looking at the data and trying to find some meaningful patterns by just using some simple statistical rules might miss important interactions. You really want to capture the correlations and the dependencies of the variables, which can be quite complicated, in a model. With GenSQL, we want to enable a large set of users to query their data and their model without having to know all the details,” adds lead author Mathieu Huot, a research scientist in the Department of Brain and Cognitive Sciences and member of the Probabilistic Computing Project. They are joined on the paper by Matin Ghavami and Alexander Lew, MIT graduate students; Cameron Freer, a research scientist; Ulrich Schaechtle and Zane Shelby of Digital Garage; Martin Rinard, an MIT professor in the Department of Electrical Engineering and Computer Science and member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Feras Saad ’15, MEng ’16, PhD ’22, an assistant professor at Carnegie Mellon University. The research was recently presented at the ACM Conference on Programming Language Design and Implementation. Combining models and databases SQL, which stands for structured query language, is a programming language for storing and manipulating information in a database. In SQL, people can ask questions about data using keywords, such as by summing, filtering, or grouping database records. However, querying a model can provide deeper insights, since models can capture what data imply for an individual. For instance, a female developer who wonders if she is underpaid is likely more interested in what salary data mean for her individually than in trends from database records. 
The researchers noticed that SQL didn’t provide an effective way to incorporate probabilistic AI models, but at the same time, approaches that use probabilistic models to make inferences didn’t support complex database queries. They built GenSQL to fill this gap, enabling someone to query both a dataset and a probabilistic model using a straightforward yet powerful formal programming language.
A GenSQL user uploads their data and probabilistic model, which the system automatically integrates. Then, she can run queries on data that also get input from the probabilistic model running behind the scenes. This not only enables more complex queries but can also provide more accurate answers. For instance, a query in GenSQL might be something like, “How likely is it that a developer from Seattle knows the programming language Rust?” Just looking at a correlation between columns in a database might miss subtle dependencies. Incorporating a probabilistic model can capture more complex interactions.
Plus, the probabilistic models GenSQL utilizes are auditable, so people can see which data the model uses for decision-making. In addition, these models provide measures of calibrated uncertainty along with each answer. For instance, with this calibrated uncertainty, if one queries the model for predicted outcomes of different cancer treatments for a patient from a minority group that is underrepresented in the dataset, GenSQL would tell the user that it is uncertain, and how uncertain it is, rather than overconfidently advocating for the wrong treatment.
Faster and more accurate results
To evaluate GenSQL, the researchers compared their system to popular baseline methods that use neural networks. GenSQL was between 1.7 and 6.8 times faster than these approaches, executing most queries in a few milliseconds while providing more accurate results.
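The article's example query ("How likely is it that a developer from Seattle knows the programming language Rust?") can be caricatured as a conditional probability under a generative model. The following is not GenSQL syntax, just a plain Monte Carlo sketch, with entirely made-up parameters, of the kind of question a data-plus-model query answers:

```python
import random

# Toy generative model of developers (all parameters are invented):
# city influences employer mix, which influences language exposure.
def sample_developer():
    city = random.choices(["Seattle", "Other"], weights=[0.1, 0.9])[0]
    systems_shop = random.random() < (0.35 if city == "Seattle" else 0.15)
    knows_rust = random.random() < (0.40 if systems_shop else 0.05)
    return city, knows_rust

random.seed(3)
draws = [sample_developer() for _ in range(200_000)]
seattle = [r for c, r in draws if c == "Seattle"]
other = [r for c, r in draws if c == "Other"]
# Estimate P(knows Rust | city) by conditioning on the sampled city.
p_rust_given_seattle = sum(seattle) / len(seattle)
p_rust_given_other = sum(other) / len(other)
```

The point of the sketch: the model captures an indirect dependency (city to employer type to language) that a raw column-to-column correlation in the database could miss, which is exactly the gap the article says GenSQL fills.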
They also applied GenSQL in two case studies: one in which the system identified mislabeled clinical trial data and the other in which it generated accurate synthetic data that captured complex relationships in genomics. Next, the researchers want to apply GenSQL more broadly to conduct large-scale modeling of human populations. With GenSQL, they can generate synthetic data to draw inferences about things like health and salary while controlling what information is used in the analysis. They also want to make GenSQL easier to use and more powerful by adding new optimizations and automation to the system. In the long run, the researchers want to enable users to make natural language queries in GenSQL. Their goal is to eventually develop a ChatGPT-like AI expert one could talk to about any database, which grounds its answers using GenSQL queries.
https://www.techbriefs.com/component/content/article/51167-meet-gensql-generative-ai-for-databases
Here is an exclusive Tech Briefs interview, edited for length and clarity, with Huot.
Tech Briefs: What was the biggest technical challenge you faced while developing GenSQL?
Huot: I think being able to compactly represent a whole range of problems you want to solve in a way that's going to be accessible — how do you not develop 20 different tools and sets of ideas that people can reuse. The point was to be concise and accessible to a large audience. So, how do you make that possible and unify all these different kinds of questions — that was definitely a challenge.
Tech Briefs: Can you explain in simple terms how it works?
Huot: You usually have a model; if not, you can have one automatically synthesized for you, given some data. And then given the data — which is either what you train your model on or a different set; but the data of interest that you want to analyze and your model — you can type questions in this formal language. But now we also have some prototype language integration where you can just write in English and it would translate to the formal language. Then, your query is run in the formal language, and you get specialized answers, which can be tables, numbers, etc.
Tech Briefs: How did this work come about? What was the catalyst for the project?
Huot: We've had, in our team, a history of looking at data science and statistics. So, people were aware of the different challenges in applying existing tools to certain problems. Either it was too specialized, or the automation wasn't good, or the model needed too much expertise. So, there was a growing desire in the team to produce something more accessible. Maybe having a hybrid team of some people being more interested in applications and others thinking about language design was the key to utilizing this opportunity to create something new that can solve a problem by being more accessible to data scientists.
Tech Briefs: What are your next steps? Do you have plans for further research, work, etc. to accomplish these goals?
Huot: There are several technical challenges, which I won't go into the details of. But one challenge is how to scale these things up: how to build a better backend system that can create richer and faster models to query. That's one part. Another is making it more accessible, for example, through language modeling. So, you speak or write in English, then we can convert it to a formal query. But in the process, sometimes you find ambiguities in the language. One example is, ‘Show me all the people who are Democratic voters who voted one way in the U.S. in a certain year.’ But those voters might be registered in that party or just identify as being in that party. So, it can be ambiguous, depending on the dataset. And you may not know that up front. Here, the system could show you two queries depending on which one you are actually asking for.
On a larger scale, we have some people looking at how to apply these tools to build models for prostate cancer data. Here the challenge is that many experiments are on a small scale, like a single hospital, and they can't really share the data. So, if you train a model on that, it's not really telling you enough to make many predictions. But there are a lot of these experiments on the internet that are freely accessible. They're usually not super clean or they have different conventions; you can't really train a model by magic on all of these datasets and then solve all your problems. But these tools that we develop can help. We're pushing to make the tools able to construct a model that's going to look at, for example, two different clinical trials where the roles don't exactly match. One may have people grouped by age, another by category; how do people get assigned to a group; they might have different labels. So, automating data harmonization.
Tech Briefs: Do you have any advice for researchers or engineers aiming to bring their ideas to fruition, broadly speaking?
Huot: I think it's great — if you want to create something to apply to a problem — to clearly know what the problem is. It may sound extremely simple, but it's really hard in research. You have to spend, for example, six months in the mud to really get to know the problem and what people actually want instead of what you think they want. That can be really hard to do. Usually, it's going to take way longer to solve that problem than you might think; creating these tools takes years. Start from the bottom, be humble.
Scott recommended delving into "Probabilistic programs are more expressive and flexible than Bayesian networks, allowing for more complex modeling," which I started in Probabilistic Programs vs. Bayesian Networks for Modeling Startup Growth (cld). Bayesian network approach: continuous growth metrics.
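A minimal sketch of the expressiveness point: in a probabilistic program, the number of pivots can itself be random (stochastic control flow), whereas a fixed-structure Bayesian network must pre-declare every node. All distributions and thresholds here are invented for illustration:

```python
import random

def startup_trajectory():
    """A generative program a fixed-structure Bayes net cannot express:
    the number of pivot variables is itself a random quantity."""
    growth = random.gauss(0.05, 0.02)       # baseline monthly growth rate
    n_pivots = 0
    # Keep pivoting while traction is poor (stochastic control flow).
    while growth < 0.03 and n_pivots < 10:
        n_pivots += 1
        growth += random.gauss(0.02, 0.03)  # each pivot shifts growth
    return n_pivots, growth

random.seed(4)
runs = [startup_trajectory() for _ in range(20000)]
pivot_counts = [n for n, _ in runs]
# Trajectory length varies run to run; a Bayes net would need a fixed,
# pre-declared set of nodes covering every possible pivot in advance.
max_pivots = max(pivot_counts)
mean_pivots = sum(pivot_counts) / len(pivot_counts)
```

The `while` loop is the crux: each execution induces a different dependency graph, which is exactly the kind of model an ordinary Bayes net's fixed DAG cannot represent directly.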
5. System dynamics model library as world model to implement rational meaning construction for entrepreneurship
@tomfid: Josh from #250 spent two weeks arguing the limitations of graphical-model representations (Bayes nets) and neural networks for inference. I'll follow up on inference this weekend, but could you confirm that, based on the distinction below, SD is also a probabilistic program? Causal Bayes net vs. probabilistic program:
The table explains differences in the representation, expressiveness, and capabilities of these two approaches to probabilistic modeling and reasoning.
5. Programmable automated inference: risk vs. reward tradeoff in user space between robustness and generality
vkm_lec_note12.pdf (d): How can learning and inference proceed efficiently and accurately, even with very complex hypothesis spaces?
Main takeaway for today: risk vs. reward tradeoff in the user's space (affordance) + three benefits of automation (correctness, speed of implementation, performance). Reverse engineering the cognitive component enables a "rational risk and reward tradeoff in user space along the robustness, automation, and generality dimensions."
When I asked about the meaning of "app-level scaling is empirical" from transformers, Mathieu and Matin replied:
@chasfine I find the below slide from Vikash (from last summer school) to be a useful overview of #249
Vikash's class on scaling AI
Syllabus for 9.S916 Spec Subj Brain & Cog Sciences.pdf
After sitting in two classes by Vikash in 2024 Fall, Charlie shared his abstracted understanding of Vikash's lectures, which I had always thought about but hadn't verbalized: "you can take these mathematical statements, turn them into computer statements, have the computer execute the mathematical models, and you can do that with bringing to bear a lot of computational power to complicated questions." Charlie thinks infinite state space is the biggest obstacle in applying prob.prog to entrepreneurship.
target audience: BE evaluators
for PC evaluators, i summarized bayes.entrep in sister thread #234
this thread's purpose is twofold:
to put scalable auto-modeling using Bayesian synthesis and domain-specific languages on my Bayes in Business supporters' radar 📡
to share my willingness to bet on prob.comp with Vikash and seek a path to officially join the prob.comp project
format: Bob Horn's information mural, with structure as 𝌭 designing and image sequence as 🎞️ filming. The collaborated output of designer-value (𝌭), artist-vision (🎞️), scientist-understand (🔭), and engineer-solve (⚙️) is architecture 🏛️.
keywords: domain specific language claude, inference controller, reverse engineering natural intelligence, population model, posterior sampling algorithms
contents
Inference controller, SMC algorithms (a): How does abstract knowledge guide learning and inference from sparse data? (e): How can probabilistic inferences be used to drive action?
Reverse engineering natural intelligence (b): What form does that knowledge take, across different domains and tasks?
Larger population model (c): How is that knowledge itself constructed?
Posterior space sampling algorithms (d): How can learning and inference proceed efficiently and accurately, even with very complex hypothesis spaces?
Programmable Automated Inference: Risk vs Reward tradeoff in user space between robustness and generality (d): How can learning and inference proceed efficiently and accurately, even with very complex hypothesis spaces?
(e): How can probabilistic inferences be used to drive action?
papers
Gen Tutorial
angie's action
show Vikash this demo, which is a Stan-based startup pivot simulation that tests existing hypotheses in the entrepreneurship literature and aligns with prob.comp's vision, to seek a path to proceed with the project to develop a startup education and prediction tool under a computer-human interaction expert's umbrella
plan: develop education material with the Academy of Management conference Professional Development Workshops team (led by Andreas Schwab), e.g. gen-finance #177
model entrepreneurial growth IAI. innovative augmented intelligence 🤖 #174 (comment)
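The Stan-based demo itself lives in the linked threads; as a tiny stand-in (in plain Python, with made-up data, not the actual demo), here is a grid-approximated posterior over a pivot's success probability after observing a handful of hypothetical pivot outcomes:

```python
# Hypothetical observed pivot outcomes: 3 successes out of 10 trials.
successes, trials = 3, 10

# Grid approximation of the posterior over the success probability p.
grid = [i / 1000 for i in range(1, 1000)]
prior = [1.0 for _ in grid]            # flat prior over p
likelihood = [p ** successes * (1 - p) ** (trials - successes)
              for p in grid]
unnorm = [pr * lk for pr, lk in zip(prior, likelihood)]
total = sum(unnorm)
posterior = [u / total for u in unnorm]

posterior_mean = sum(p * w for p, w in zip(grid, posterior))
# With a flat prior, this approximates the Beta(4, 8) mean, 4/12.
```

A Stan model would replace the grid with HMC sampling and let the simulation scale to many latent variables, but the conditioning logic (prior times likelihood, normalized) is the same.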