805 Columbus Avenue
513 Interdisciplinary Science & Engineering Complex (ISEC)
Boston, MA 02120
ATTN: Lawson Wong, 5th Floor ISEC
360 Huntington Avenue
Boston, MA 02115-5000
- PhD, Massachusetts Institute of Technology
Lawson L.S. Wong is an assistant professor in the Khoury College of Computer Sciences at Northeastern University. His research focuses on learning, representing, and estimating knowledge about the world that an autonomous robot may find useful.
Prior to Northeastern, Lawson was a postdoctoral fellow at Brown University. He completed his PhD at the Massachusetts Institute of Technology. He has received a Siebel Fellowship, AAAI Robotics Student Fellowship, and Croucher Foundation Fellowship for Postdoctoral Research.
David Abel, Dilip Arumugam, Kavosh Asadi, Yuu Jinnai, Michael L. Littman, Lawson L.S. Wong. State abstraction as compression in apprenticeship learning. In AAAI Conference on Artificial Intelligence (AAAI), 2019. https://doi.org/10.1609/aaai.v33i01.33013134
State abstraction can give rise to models of environments that are both compressed and useful, thereby enabling efficient sequential decision making. In this work, we offer the first formalism and analysis of the trade-off between compression and performance in the context of state abstraction for apprenticeship learning. We build on rate-distortion theory, the classic Blahut-Arimoto algorithm, and the Information Bottleneck method to develop an algorithm for computing state abstractions that approximate the optimal trade-off between compression and performance. We illustrate the power of this algorithmic structure to offer insights into effective abstraction, compression, and reinforcement learning through a mixture of analysis, visuals, and experimentation.
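The classic Blahut-Arimoto iteration that this line of work builds on can be sketched in a few lines. The discrete setup, function name, and parameters below are illustrative assumptions for exposition, not the paper's implementation:

```python
import math

def blahut_arimoto(p_x, dist, beta, iters=200):
    """Approximate a rate-distortion-optimal encoder q(z|x) by
    alternating the two Blahut-Arimoto updates.
    p_x  : source distribution, list of floats summing to 1
    dist : dist[x][z] = distortion of coding symbol x as codeword z
    beta : trade-off parameter (higher beta favours low distortion)
    """
    n, m = len(p_x), len(dist[0])
    q_z = [1.0 / m] * m                       # marginal over codewords
    q_zx = [[1.0 / m] * m for _ in range(n)]  # encoder q(z|x)
    for _ in range(iters):
        # Update encoder: q(z|x) is proportional to q(z) * exp(-beta * d(x,z)).
        for x in range(n):
            w = [q_z[z] * math.exp(-beta * dist[x][z]) for z in range(m)]
            s = sum(w)
            q_zx[x] = [wi / s for wi in w]
        # Update marginal: q(z) = sum_x p(x) q(z|x).
        q_z = [sum(p_x[x] * q_zx[x][z] for x in range(n)) for z in range(m)]
    return q_zx, q_z
```

Large β yields a near-deterministic encoder (low distortion, little compression); β → 0 collapses to the maximally compressed, uninformative abstraction.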
Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, et al. Grounding natural language instructions to semantic goal representations for abstraction and generalization. Autonomous Robots (AURO), 43:449, 2019. https://doi.org/10.1007/s10514-018-9792-8
Language grounding is broadly defined as the problem of mapping natural language instructions to robot behavior. To truly be effective, these language grounding systems must be accurate in their selection of behavior, efficient in the robot’s realization of that selected behavior, and capable of generalizing beyond commands and environment configurations only seen at training time. One choice that is crucial to the success of a language grounding model is the choice of representation used to capture the objective specified by the input command. Prior work has been varied in its use of explicit goal representations, with some approaches lacking a representation altogether, resulting in models that infer whole sequences of robot actions, while other approaches map to carefully constructed logical form representations. While many of the models in either category are reasonably accurate, they fail to offer either efficient execution or any generalization without requiring a large amount of manual specification. In this work, we take a first step towards language grounding models that excel across accuracy, efficiency, and generalization through the construction of simple, semantic goal representations within Markov decision processes. We propose two related semantic goal representations that take advantage of the hierarchical structure of tasks and the compositional nature of language respectively, and present multiple grounding models for each. We validate these ideas empirically with results collected from following text instructions within a simulated mobile-manipulator domain, as well as demonstrations of a physical robot responding to spoken instructions in real time. Our grounding models tie abstraction in language commands to a hierarchical planner for the robot’s execution, enabling a response-time speed-up of several orders of magnitude over baseline planners within sufficiently large domains. 
Concurrently, our grounding models for generalization infer elements of the semantic representation that are subsequently combined to form a complete goal description, enabling the interpretation of commands involving novel combinations never seen during training. Taken together, our results show that the design of semantic goal representation has powerful implications for the accuracy, efficiency, and generalization capabilities of language grounding models.
Arthur Wandzel, Yoonseon Oh, Michael Fishman, Nishanth Kumar, Lawson L.S. Wong, Stefanie Tellex. Multi-object search using object-oriented POMDPs. In IEEE International Conference on Robotics and Automation (ICRA), 2019.
A core capability of robots is to reason about multiple objects under uncertainty. Partially Observable Markov Decision Processes (POMDPs) provide a means of reasoning under uncertainty for sequential decision making, but are computationally intractable in large domains. In this paper, we propose Object-Oriented POMDPs (OO-POMDPs), which represent the state and observation spaces in terms of classes and objects. The structure afforded by OO-POMDPs supports a factorization of the agent’s belief into independent object distributions, which enables the size of the belief to scale linearly rather than exponentially in the number of objects. We formulate a novel Multi-Object Search (MOS) task as an OO-POMDP for mobile robotics domains in which the agent must find the locations of multiple objects. Our solution exploits the structure of OO-POMDPs by using human language to selectively update the belief at task onset. Using this structure, we develop a new algorithm for efficiently solving OO-POMDPs: Object-Oriented Partially Observable Monte-Carlo Planning (OO-POMCP). We show that OO-POMCP with grounded language commands is sufficient for solving challenging MOS tasks both in simulation and on a physical mobile robot.
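The linear-versus-exponential scaling claim is easy to see in code. The class and method names below are invented for this sketch and are not the paper's API:

```python
def joint_belief_size(n_locations, n_objects):
    """A flat joint belief over k objects, each with n candidate
    locations, needs n**k entries."""
    return n_locations ** n_objects

class FactoredBelief:
    """Toy version of the OO-POMDP factorization: one independent
    distribution per object, so storage grows linearly in the
    number of objects."""
    def __init__(self, n_locations, n_objects):
        uniform = [1.0 / n_locations] * n_locations
        self.per_object = [list(uniform) for _ in range(n_objects)]

    def size(self):
        return sum(len(b) for b in self.per_object)

    def update(self, obj, likelihood):
        """Bayesian update of a single object's distribution;
        the other objects' beliefs are untouched."""
        b = [p * l for p, l in zip(self.per_object[obj], likelihood)]
        s = sum(b)
        self.per_object[obj] = [p / s for p in b]
```

With 100 candidate locations and 3 objects, the flat belief has a million entries while the factored one stores 300 numbers.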
Nakul Gopalan, Dilip Arumugam, Lawson L.S. Wong, Stefanie Tellex. Sequence-to-sequence language grounding of non-Markovian task specifications. In Robotics: Science and Systems (RSS), 2018.
Oftentimes, natural language commands issued to robots not only specify a particular target configuration or goal state but also outline constraints on how the robot goes about its execution. That is, the path taken to achieve some goal state is given equal importance to the goal state itself. One example of this could be instructing a wheeled robot to go to the living room but avoid the kitchen, in order to avoid scuffing the floor. This class of behaviors poses a serious obstacle to existing language understanding approaches for robotics that map to either action sequences or goal state representations. Due to the non-Markovian nature of the objective, approaches in the former category must map to potentially unbounded action sequences, whereas approaches in the latter category would require folding the entirety of a robot’s trajectory into a (traditionally Markovian) state representation, resulting in an intractable decision-making problem. To resolve this challenge, we use a recently introduced probabilistic variant of Linear Temporal Logic (LTL) as a goal specification language for a Markov Decision Process (MDP). While demonstrating that standard neural sequence-to-sequence learning models can successfully ground language to this semantic representation, we also provide analysis that highlights generalization to novel, unseen logical forms as an open problem for this class of model. We evaluate our system within two simulated robot domains as well as on a physical robot, demonstrating accurate language grounding alongside a significant expansion in the space of interpretable robot behaviors.
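The living-room example corresponds to an LTL formula of the form "eventually living_room, and always not kitchen". Over a finite trace the check is straightforward, and it makes the non-Markovian point concrete; this toy checker is an illustration, not the paper's grounding system:

```python
def check_eventually_and_avoid(trajectory, goal, avoid):
    """Evaluate (eventually `goal`) AND (always not `avoid`) on a
    finite trace, where `trajectory` is the sequence of rooms visited.
    A Markovian goal test cannot express this: satisfaction depends
    on the whole path, not just the final state."""
    reached = any(room == goal for room in trajectory)
    stayed_out = all(room != avoid for room in trajectory)
    return reached and stayed_out
```

A path through the kitchen fails even if it ends in the living room, which is exactly what a goal-state-only representation would miss.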
Lawson L.S. Wong, Leslie Pack Kaelbling, Tomás Lozano-Pérez. Data association for semantic world modeling from partial views. International Journal of Robotics Research (IJRR), 34(7):1064-1082, 2015.
Autonomous mobile-manipulation robots need to sense and interact with objects to accomplish high-level tasks such as preparing meals and searching for objects. To achieve such tasks, robots need semantic world models, defined as object-based representations of the world involving task-level attributes. In this work, we address the problem of estimating world models from semantic perception modules that provide noisy observations of attributes. Because attribute detections are sparse, ambiguous, and aggregated across different viewpoints, it is unclear which attribute measurements are produced by the same object, so data association issues are prevalent. We present novel clustering-based approaches to this problem, which are more efficient and require less severe approximations than existing tracking-based approaches. These approaches are applied to data containing object type-and-pose detections from multiple viewpoints, and achieve comparable quality using a fraction of the computation time.
Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L.S. Wong, Stefanie Tellex. Accurately and efficiently interpreting human-robot instructions of varying granularities. In Robotics: Science and Systems (RSS), 2017.
Humans can ground natural language commands to tasks at both abstract and fine-grained levels of specificity. For instance, a human forklift operator can be instructed to perform a high-level action, like ‘grab a pallet’, or a low-level action, like ‘tilt back a little bit’. While robots are also capable of grounding language commands to tasks, previous methods implicitly assume that all commands and tasks reside at a single, fixed level of abstraction. Additionally, methods that do not use multiple levels of abstraction encounter inefficient planning and execution times as they solve tasks at a single level of abstraction with large, intractable state-action spaces closely resembling real-world complexity. In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity, ranging from coarse to more granular. We show that the accuracy of the grounding procedure is improved when simultaneously inferring the degree of abstraction in the language used to communicate the task. Leveraging hierarchy also improves efficiency: our proposed approach enables a robot to respond to a command within one second on 90% of our tasks, while baselines take over twenty seconds on half the tasks. Finally, we demonstrate that a real, physical robot can ground commands at multiple levels of abstraction, allowing it to efficiently plan different subtasks within the same planning hierarchy.
Nakul Gopalan, Marie desJardins, Michael L. Littman, James MacGlashan, Shawn Squire, Stefanie Tellex, John Winder, Lawson L.S. Wong. Planning with abstract Markov decision processes. In International Conference on Automated Planning and Scheduling (ICAPS), 2017.
David Whitney, Eric Rosen, James MacGlashan, Lawson L.S. Wong, Stefanie Tellex. Reducing errors in object-fetching interactions through social feedback. In IEEE International Conference on Robotics and Automation (ICRA), 2017.
Lawson L.S. Wong, Thanard Kurutach, Tomás Lozano-Pérez, Leslie Pack Kaelbling. Object-based world modeling in semi-static environments with dependent Dirichlet process mixtures. In International Joint Conference on Artificial Intelligence (IJCAI), 2016.
To accomplish tasks in human-centric indoor environments, agents need to represent and understand the world in terms of objects and their attributes. We consider how to acquire such a world model via noisy perception and maintain it over time, as objects are added, changed, and removed in the world. Previous work framed this as a multiple-target tracking problem, where objects are potentially in motion at all times. Although this approach is general, it is computationally expensive. We argue that such generality is not needed in typical world modeling tasks, where objects only change state occasionally. More efficient approaches are enabled by restricting ourselves to such semi-static environments. We consider a previously proposed clustering-based world modeling approach that assumed static environments, and extend it to semi-static domains by applying a dependent Dirichlet process (DDP) mixture model. We derive a novel MAP inference algorithm under this model, subject to data association constraints. We demonstrate that our approach improves computational performance for world modeling in semi-static environments.