Wednesday, September 28, 2011

Paper Reading #13: Combining Multiple Depth Cameras and Projectors for Interactions On, Above, and Between Surfaces

Combining Multiple Depth Cameras and Projectors for Interactions On, Above, and Between Surfaces
 
UIST 2010

By:
Andrew D. Wilson, Hrvoje Benko.



  • Andrew D. Wilson is currently a Senior Researcher at Microsoft Research. He has a BA from Cornell University, along with an MS and a PhD from MIT.
  • Hrvoje Benko has a PhD from Columbia University. He is currently a researcher at Microsoft Research.
Summary
Hypothesis
The authors of this paper have developed a system called LightSpace that allows an ordinary room to be converted into an interactive, quasi-3D surface.

Methods
For purposes of testing, they had LightSpace on display at a convention, available for public use. People were encouraged to explore and push the system to its limits. The few hundred people who used LightSpace were observed by researchers while they were using the system, so as to record test data and help improve the system.

Results
The system works pretty well as far as typical use goes. From a technical perspective, there is no limitation on how many people can use the system at once; however, the technical constraints are not the same as the physical constraints. While the system may support an unlimited number of people, the room housing it does not, and performance drops drastically with multiple people on the system. Users found the system easy to use, though like anything else it took some effort on their part to get used to the system's operation, which is naturally expected.

Contents
LightSpace, as described in this paper, uses multiple projectors and depth cameras in tandem to turn the entire room into a massive, Star Trek-like interactive environment. This allows LightSpace to merge any and all interactive surfaces into one large interactive surface that users can work with in any way, shape, or form.
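The key enabler here is calibration: every depth camera and projector is registered to one shared 3D coordinate system, so a point sensed by any camera can be re-projected into any projector and drawn right where it sits in the room. Below is a minimal sketch of that re-projection step (my own illustration, not the authors' code; the calibration numbers are invented):

    # Minimal sketch (not the authors' code): mapping a 3D point sensed by a
    # depth camera into a projector's image, assuming both devices have been
    # calibrated into one shared world coordinate system.
    import numpy as np

    # Hypothetical projector calibration: intrinsics K and pose [R|t].
    K = np.array([[1400.0,    0.0, 640.0],
                  [   0.0, 1400.0, 400.0],
                  [   0.0,    0.0,   1.0]])
    R = np.eye(3)                        # projector rotation in world coordinates
    t = np.array([[0.0], [0.0], [2.5]])  # projector 2.5 m from the world origin
    P = K @ np.hstack([R, t])            # full 3x4 projection matrix

    def world_to_projector_pixel(point_3d):
        """Project a world-space point (metres) to projector pixel coordinates."""
        homogeneous = np.append(point_3d, 1.0)   # [x, y, z, 1]
        u, v, w = P @ homogeneous
        return u / w, v / w                      # perspective divide

    # A point reported by a depth camera (already transformed to world space).
    print(world_to_projector_pixel(np.array([0.1, -0.2, 0.0])))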

Discussion
I find this step in the direction of a Star Trek-like reality quite awesome. The ability to have access in all directions on all surfaces seems positively brilliant to me. Admittedly this was a purely lab-designed, lab-tested, and lab-implemented idea, but it still seems pretty impressive. It is one step closer to 3D interaction and an even closer step to ubiquitous computing.

Tuesday, September 27, 2011

Gang Leader for a Day

Gang Leader for a Day

By:

Sudhir Venkatesh



This book is not something I would typically read. I tend to avoid non-fiction like the plague when I can. That being said, this book was fairly intriguing. On some level it offended me with the sheer lack of professionalism on Sudhir's part; at the same time it made me glad that there still exist people who aren't so narrow-minded as to just ignore someone different from themselves, and who are actually willing to build a relationship and allow them into one's culture.

While I found myself agreeing with Sudhir initially about the silliness of asking students to study a group of people or a culture in a lab setting, I found myself appalled by his solution to that silly idea. Yes, going out and immersing yourself in a culture you wish to study is a fantastic idea. Yes, if you get accepted into it as one of their own, it's even better. It is still highly important to set boundaries, to know when to call it quits, and to maintain some sort of objectivity (objectivity being something Sudhir never seemed to have).

It was great to see his willingness to write about a group that was being persecuted and needed help. To bring forth the corruption and prejudice of the times in question is a brave thing. To do it decades after it happened is not brave, but potentially even cowardly. He helped, he certainly did, but he did not live up to his position. His willingness to help those affected by gangs and the drug trade is commendable. His willingness to help a group of people who blatantly break the law is unacceptable to me. And at the end of it all, to just walk away, crank out a thesis and then later a book? That is just plain callous.

Paper Reading #12: Enabling Beyond-Surface Interactions for Interactive Surface with An Invisible Projection

Enabling Beyond-Surface Interactions for Interactive Surface with An Invisible Projection

UIST 2010

By:
Li-Wei Chan, Hsiang-Tao Wu, Hui-Shan Kao, Ju-Chun Ko, Home-Ru Lin, Mike Y. Chen, Jane Hsu, Yi-Ping Hung.

  • Li-Wei Chan has a Bachelors in Computer Science from Fu Jen Catholic University, along with a Masters from the National Taiwan University. He is also currently a PhD student there.
  • Hsiang-Tao Wu is currently a student at the National Taiwan University.
  • Hui-Shan Kao is currently a student at the National Taiwan University.
  • Ju-Chun Ko is currently a PhD student at the National Taiwan University.
  • Home-Ru Lin is currently a student at the National Taiwan University.
  • Mike Y. Chen is currently a Computer Science professor at the National Taiwan University.
  • Jane Hsu is currently a Computer Science and Information Engineering professor at the National Taiwan University.
  • Yi-Ping Hung has a Masters and PhD from Brown University. He is currently a professor in the Graduate Institute of Networking and Multimedia at the National Taiwan University.
Summary
Hypothesis
The authors of this paper propose a programmable infrared technique that uses invisible programmable markers to enable interaction beyond the standard surface of a diffused-illumination (DI) multi-touch system. This allows mobile devices with IR cameras to figure out their position in three-dimensional space based on the perceived markers.

Methods
To test their system they essentially tested three different applications (im Lamp, im Flashlight, and im View).

The user testing was aimed at figuring out the key strengths and weaknesses of the prototype. Users were asked to use the three available applications to navigate a map and reach various checkpoints.

Results
For the most part users found all three applications usable. One of the key problems faced was focus, which im Lamp had some trouble with; however, that was an issue that could be resolved by using the appropriate focal length. im Flashlight, on the other hand, had all sorts of trouble with focus.

Contents
The combination of IR projectors and a colour projector displays content visible to the user along with markers visible only to the IR cameras on mobile devices, which helps those devices calculate their location in 3D space. The use of multi-touch tablets and projectors allows interaction with the space beyond the display plane.
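As I understand it, the IR camera on the mobile device spots the invisible markers, and since the markers' positions on the tabletop are known, the device can solve for its own pose. A hedged sketch of that step using OpenCV's standard pose solver (my own illustration; the marker coordinates and camera intrinsics are made up):

    import numpy as np
    import cv2

    # Known 3D positions of four IR markers on the tabletop (metres, table plane z=0).
    marker_world = np.array([[0.0, 0.0, 0.0],
                             [0.3, 0.0, 0.0],
                             [0.3, 0.2, 0.0],
                             [0.0, 0.2, 0.0]], dtype=np.float32)

    # Where the device's IR camera saw those markers (pixels).
    marker_image = np.array([[210.0, 310.0],
                             [420.0, 305.0],
                             [415.0, 180.0],
                             [205.0, 185.0]], dtype=np.float32)

    # Hypothetical IR camera intrinsics (focal length and principal point).
    K = np.array([[800.0,   0.0, 320.0],
                  [  0.0, 800.0, 240.0],
                  [  0.0,   0.0,   1.0]])

    ok, rvec, tvec = cv2.solvePnP(marker_world, marker_image, K, None)
    print("device position relative to the table:", tvec.ravel())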

im Lamp is essentially a desk lamp for multi-touch tabletops. It consists of an IR lamp with a pico projector attached, which provides additional information wherever the lamp is directed on the tabletop, thus allowing ease of use and the movement of content anywhere on the tabletop.

im Flashlight is a mobile im Lamp that allows users to project content in a more mobile and versatile fashion. The flashlight-like form allows emphasis on relevant areas of the projected content.

im View (possibly the biggest contribution of this paper) is the tablet referred to earlier. It has an IR camera attached, and it allows users to explore an almost 3D geographical area.

Discussion
The use of newer, cheaper, and highly portable pico-projectors is something that fills me with joy. Considering pico-projectors are greatly underrated, it's nice to see them used in such a technical setting. That being said, it seems like an awfully complex project. While the idea is genius, the flashlight implementation is a bit of a reach. Using it with a lamp to project onto a worktable is highly useful and intelligent, ditto for the tablet PC implementation; however, I find the flashlight implementation quite sad, especially considering its major focus issues. While it would be fantastic to have that ability, we just don't have sufficiently small and sufficiently decent pico-projectors to pull it off.

Paper Reading #11: Multitoe: High-Precision Interaction with Back-Projected Floors Based on High-Resolution Multi-Touch Input

Multitoe: High-Precision Interaction with Back-Projected Floors Based on High-Resolution Multi-Touch Input

UIST 2010

By:
Thomas Augsten, Konstantin Kaefer, René Meusel, Caroline Fetzer, Dorian Kanitz, Thomas Stoff, Torsten Becker, Christian Holz, and Patrick Baudisch.

  • Thomas Augsten is currently a Masters student specializing in IT Systems at the University of Potsdam in Germany.
  • Konstantin Kaefer is currently a Masters student specializing in IT Systems at the University of Potsdam in Germany.
  • René Meusel is currently a student at the Hasso Plattner Institute.
  • Caroline Fetzer is currently a student at the Hasso Plattner Institute.
  • Dorian Kanitz is currently a student at the Hasso Plattner Institute.
  • Thomas Stoff is currently a student at the Hasso Plattner Institute.
  • Torsten Becker is currently a student at the Hasso Plattner Institute.
  • Christian Holz is getting his PhD in HCI from the University of Potsdam in Germany.
  • Patrick Baudisch is a Computer Science professor at the Hasso Plattner Institute.
Summary
Hypothesis
The authors of this paper hypothesise that current touch-screen tabletop options are insufficient in surface area. They propose a rear-projected floor, which solves the area issue, and then recommend using users' ability to walk and to manipulate the screen with actions performed by their feet. They state that the paper serves an engineering need as opposed to a scientific need.

Methods
All tests done by the authors are for the purposes of design improvement. They did one user study to figure out how to appropriately implement the inactive state of buttons. They had participants walk across four 'buttons' so as to trigger two and not trigger the other two. Their behaviour and strategies were noted, and they were fully debriefed in a verbal interview afterwards.

The second test was done for the purposes of designing a mechanism for a conceptual hotspot. The authors wished to find out if there was one certain area of the foot that was widely found intuitive, thus allowing them to use one standard hotspot as opposed to personalising it per user. They had users place their preferred foot's hotspot on a predetermined cross-hair and then confirm their choice using a wireless transmitter. Upon confirmation the frustrated total internal reflection (FTIR) image was recorded, which allowed the building of a model.

Another study was conducted to optimize precision with the hotspot. Users selected a hotspot of their choice. They were then asked to use a keyboard to type a sentence using varying sizes of keys.

Results
The results of the first test showed that people found far too many different options intuitive, meaning there was no single widely accepted approach that could be implemented. Some of the strategies suggested were found to be impractical due to scalability concerns. However, the authors were able to find four feasible options (tap, stomp, jump and double tap) that could be implemented and scaled up to a fully operational model.

The results of the second test showed that most users conceptualized their touch as the idealized footprint (i.e. including the arch). FTIR uses pressure points and thus does not capture the arch; however, given how widely shared this expectation was, the authors felt they could refine the FTIR image to fit users' conceptualized ideal input.

The results of the hotspot optimization test showed that the smallest key size was obviously far too small, due to its high error rate. Both error rate and task time skyrocketed as key size decreased.

Contents
The screen described in this paper is implemented using layers of various materials (projection screen, acrylic, silicone, and thick glass). Using pressure distribution, shoe-sole matching and various other features, the authors were able to develop algorithms that can recognize users, differentiate personalized hotspots, and classify various motions such as walking, tapping, etc. Using FTIR and other technology, the authors were also able to develop an algorithm that tracks user balance and uses that information to control an output screen.
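The balance tracking is the piece that is easiest to picture in code: in an FTIR image the bright pixels are the pressure points, so the user's centre of pressure is just an intensity-weighted average of pixel positions. A small sketch under that assumption (my own, not the authors' implementation):

    import numpy as np

    def centre_of_pressure(ftir_image):
        """Intensity-weighted centroid of an FTIR frame (2D array of brightness)."""
        img = ftir_image.astype(float)
        total = img.sum()
        if total == 0:
            return None  # nobody is standing on this tile
        ys, xs = np.indices(img.shape)
        return (xs * img).sum() / total, (ys * img).sum() / total

    # Toy 5x5 frame with pressure under the heel and the ball of the foot.
    frame = np.zeros((5, 5))
    frame[1, 1] = 200   # heel
    frame[3, 3] = 100   # ball of the foot
    print(centre_of_pressure(frame))   # leans toward the brighter (heel) contact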


Discussion
While this paper brings the fiction of a Star Trek-like reality one step closer, I personally feel that in its current implementation its purposes are probably limited to gaming. Surely no one wants to type using their feet. That being said, a platform like this would be quite brilliant for interactive gaming (making Wiis thankfully obsolete). This system also has the additional problem of being so physically demanding as to essentially prevent people with limited or no motor control from using it. I really liked their approach to the engineering and the amount of effort put into making the system as user-friendly as possible. Where it wasn't possible to have a standardized system, they worked hard to provide the ability to personalize.

Thursday, September 22, 2011

Paper Reading #10: Sensing Foot Gestures from the Pocket



Sensing Foot Gestures from the Pocket

By:
Jeremy Scott, David Dearman, Koji Yatani, Khai Truong.

Presented at UIST 2010.
  • Jeremy Scott has a Bachelor of Science, a Master of Science and a PhD in Pharmacology and Toxicology from the University of Western Ontario. He is currently employed as an Assistant Professor at the University of Toronto.
  • David Dearman is currently a PhD student at the University of Toronto.
  • Koji Yatani is also a current PhD student at the University of Toronto.
  • Khai Truong is currently an Associate Professor in the C.S. Department at the University of Toronto.
Summary
Hypothesis
The authors aim to study the ability to use foot-based gestures to control phones in users' pockets. Using the results of that study, they further developed a working system to recognise foot gestures and take action on the phone.

Methods
For the purposes of studying and getting initial results, the authors had participants perform certain tasks that were mapped to targets on the phone using their dominant foot (the right, for all participants). The participants were asked to hit 43 targets across 3 different flexions and rotations. Once they had used this study to figure out precisely how the system would need to be set up, they held another study to test foot gestures and how they could control a phone, using an iPhone, accelerometers, and multiple cameras similar to the preliminary study.

Results
Their testing resulted in 82%-92% accuracy in classifying foot gestures, which is fairly reliable when it comes to controlling the phone. They also learnt that having the cell phone at the user's side, as opposed to in a front or rear pocket, allowed for greater accuracy. Sadly, it also showed that the system occasionally confused similar gestures, especially those in neighbouring degree ranges.

Contents
The authors spent a fair bit of time gathering their data, compiling a list of all the gestures that were possible and needed, and then studying how easily those gestures could be recognised. Once that was all in place, the authors integrated their gathered data into a system of their own design that was able to interface with a smartphone. For data gathering and testing they used multiple cameras located at various locations and angles, working in conjunction with accelerometers, so as to interface with a phone.
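My rough picture of the recognition step: take a window of accelerometer samples, compute a few summary features per axis, and compare them against the features of previously recorded example gestures. A toy nearest-centroid sketch of that idea (my own illustration; the authors' actual features and classifier may well differ):

    import numpy as np

    def features(window):
        """Mean, standard deviation and energy per accelerometer axis (window: N x 3)."""
        w = np.asarray(window, dtype=float)
        return np.concatenate([w.mean(axis=0), w.std(axis=0), (w ** 2).mean(axis=0)])

    def train_centroids(labelled_windows):
        """labelled_windows: dict gesture_name -> list of (N x 3) training windows."""
        return {name: np.mean([features(w) for w in ws], axis=0)
                for name, ws in labelled_windows.items()}

    def classify(window, centroids):
        feats = features(window)
        return min(centroids, key=lambda name: np.linalg.norm(feats - centroids[name]))

    # Toy example: 'dorsiflexion' lifts the toes (spike on one axis), 'rotation' twists.
    rng = np.random.default_rng(0)
    train = {"dorsiflexion": [rng.normal([0, 0, 1], 0.1, (50, 3)) for _ in range(5)],
             "rotation":     [rng.normal([1, 0, 0], 0.1, (50, 3)) for _ in range(5)]}
    centroids = train_centroids(train)
    print(classify(rng.normal([0, 0, 1], 0.1, (50, 3)), centroids))  # 'dorsiflexion'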

Discussion
While the actual study and implementation are rather intriguing and "cool" in themselves, I personally find the overarching concept to be quite meaningless. The fact that the authors never resolved the issue of differentiating between gestures meant as commands and "gestures" that are really just walking, running, etc. makes this even less useful. In addition, the fact that the system greatly limits where the phone can be located in order to be accurate is even more unacceptable. While this may prove to be a somewhat useful tool for users who lack visual feedback, its use is greatly limited and its ability even more so.

Tuesday, September 20, 2011

Paper Reading #9: Jogging over a Distance between Europe and Australia

Jogging over a Distance between Europe and Australia

By:
Florian Mueller, Frank Vetere, Martin Gibbs, Darren Edge, Stefan Agamanolis, Jennifer Sheridan.

Presented at UIST 2010.
  • Florian Mueller has a Bachelor of Multimedia from Griffith University, Brisbane, Australia. He has a Digital Media Degree from Furtwangen University, Germany. He also has a PhD in Interaction Design from The University of Melbourne, Australia. He is currently a Fulbright Visiting Scholar at Stanford University.
  • Frank Vetere is currently a senior Lecturer at the University of Melbourne in the department of Information Systems. His interests lie in HCI.
  • Martin Gibbs is currently a Lecturer at the University of Melbourne in the department of Information Systems.
  • Darren Edge hold an undergraduate and a PhD from Cambridge University. He is currently an HCI researcher at Microsoft Research.
  • Stefan Agamanolis has a Bachelor of Arts in computer science from Oberlin College along with an MS and a PhD in Media Arts and Science from MIT. He is currently an Associate Director of a research institute at Akron Children's Hospital.
  • Jennifer Sheridan is currently the Senior User Experience.
Summary
Hypothesis
The authors state that Jogging over a Distance is made for runners who believe running to be a social activity and require more than one runner to make the act of running a success. They hypothesise that the use of a spatial audio communication channel, which lets a runner be virtually co-located with another runner at a different geographic location, will result in better runs for users.

Methods
The study's results are based upon reports from 17 participants over 14 runs of varying lengths. Considering the qualitative nature of this study, all results were gathered through interviews with the runners immediately following the runs in which they had used Jogging over a Distance.

Results
Their testing showed that using Jogging over a Distance did allow for a better social experience when it came to running. The participants felt that the system did a great job of simulating a legitimate two-person run. The runners were also given a new perspective and a better understanding of their running abilities due to the heart-rate sensing and the co-location capability.

Contents
The authors of this paper have attempted to build a product that ties into users' exertion and allows social interaction over a distance through it. To implement this, the authors have combined pre-existing technology with new internet-based technology so as to create a more realistic experience for users. The use of co-location allows users to know precisely where they are in terms of the other runner's pace and performance.
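The co-location idea, as I read it, is that each runner hears the other's audio positioned according to relative effort: if your partner is working harder than you, their voice seems to come from ahead of you, and vice versa. A back-of-the-envelope sketch of such a mapping (my own, with invented numbers, not the authors' formula):

    def audio_position(my_effort_pct, partner_effort_pct, max_offset_m=5.0):
        """Map relative effort (percent of target heart rate) to a virtual audio
        position in metres: positive = partner sounds ahead, negative = behind."""
        difference = partner_effort_pct - my_effort_pct
        # Clamp so the partner never drifts absurdly far away.
        return max(-max_offset_m, min(max_offset_m, difference / 10.0 * max_offset_m))

    # I'm at 92% of my target effort, my partner is at 99% of theirs:
    print(audio_position(92, 99))   # partner's voice appears ~3.5 m ahead of me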

Discussion
While from a technological aspect this paper was highly intriguing, I wasn't personally too fond of the concept. While the ability to run with someone a few continents over is highly intriguing and has a rather serious "cool factor", I just find there to be no practical application for this project. Running is plenty challenging and difficult without having to add bulky equipment just to make it possible to have a virtual runner with you, and there are plenty of products out there that for all intents and purposes replace a running partner. However, from a purely technical perspective, I am highly intrigued. Its use of the available technology is fantastic, and it does serve a pretty decent purpose if someone really is that lonely.

Thursday, September 15, 2011

Paper Reading #8: Gesture Search: A Tool for Fast Mobile Data Access

Gesture Search: A Tool for Fast Mobile Data Access

By:
Yang Li
Presented at UIST 2010.
  • Yang Li has a PhD in Computer Science from the Chinese Academy of Sciences and did postdoctoral work in EECS at the University of California, Berkeley.
Summary
Hypothesis
The author of this paper hypothesises that, considering the large amount of data stored on modern mobile devices, Gesture Search will allow for quicker and more efficient data access than current, more traditional methods of accessing data on mobile devices.

Methods
In addition to being available to the public, Gesture Search has been tested in laboratory settings. The author asked participants to answer a survey (which used a 5-point Likert scale) about their use of Gesture Search.

Results
Gesture Search received a median rating of 4.5/5 out of 5000 ratings from users. Of the 59 people who responded to the survey, the vast majority agreed that Gesture Search was useful and usable (giving a median score of 4/5). Survey takers found Gesture Search most useful for searching through contacts (median 4/5), which makes sense considering 66% of the users used it for contact search; however, they were not as excited by its use for searching through applications (median 3/5). The author believes the biggest reason for the reduced application score is that it simply isn't used often for applications, since the most-used applications fit on the home screen. The author believes that, with the wide variety of applications available, Gesture Search will eventually be used more frequently for finding applications.

Contents
This paper dealt with showing the superiority of Gesture Search over other available search methods and with explaining the methods used to implement it. Gesture Search essentially allows users to "gesture" (essentially draw/graffiti) a character on their touch-screen mobile device, and it then uses the character drawn on the screen to search within whatever application is open. Gesture Search uses a timeout to delimit multi-stroke gestures, making it more user-friendly. The ability to avoid the UI and just scribble on the screen allows for ease of use, while the speed of the search allows for efficiency.
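The timeout trick is easy to illustrate: strokes that arrive close together in time belong to the same query, and once the user pauses long enough the next stroke starts a fresh one. A rough sketch (my own, not Yang Li's implementation; the handwriting recogniser is faked and the search is a simple substring match):

    import time

    STROKE_TIMEOUT = 1.0   # seconds of inactivity before the gesture is considered done

    class GestureSearch:
        def __init__(self, items):
            self.items = items          # e.g. contact names, application names
            self.query = ""
            self.last_stroke_time = None

        def on_stroke(self, recognised_char, now=None):
            now = time.time() if now is None else now
            if self.last_stroke_time is not None and now - self.last_stroke_time > STROKE_TIMEOUT:
                self.query = ""                      # pause was long enough: new query
            self.query += recognised_char
            self.last_stroke_time = now
            return self.search()

        def search(self):
            q = self.query.lower()
            return [item for item in self.items if q in item.lower()]

    gs = GestureSearch(["Anna", "Andrew", "Calendar", "Camera"])
    print(gs.on_stroke("a", now=0.0))    # query 'a': everything matches
    print(gs.on_stroke("n", now=0.5))    # same gesture, query 'an' -> ['Anna', 'Andrew']
    print(gs.on_stroke("c", now=5.0))    # long pause: fresh query 'c' -> ['Calendar', 'Camera']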

Discussion
Having personally used something like this (I'm not certain if it was Gesture Search or not), I'm surprised to see how truly complex its implementation is. Gesture Search is certainly a far more polished version of what I have used in the past, and it shows. Gesture Search is a fantastic tool for users with touch-screen mobile devices. Its main use will probably always remain contact search; however, the fact that it has the ability to interact with practically any application makes it highly versatile and possible to customize for whatever needs may arise in the future.

Paper Reading #7: Performance Optimizations of Virtual Keyboards for Stroke-Based Text Entry on a Touch-Based Tabletop

Performance Optimizations of Virtual Keyboards for Stroke-Based Text Entry on a Touch-Based Tabletop

By:
Jochen Rick
Presented at UIST 2010
  • Jochen Rick has a Bachelors and Masters in Electrical Engineering and a PhD in Computer Science all from the Georgia Institute of Technology. He is currently an Assistant Professor at Saarland University.
Summary
Hypothesis
The author hypothesises that for equipment where a large touch-screen keyboard is the most viable text input option (such as interactive tabletops) and there is no tactile feedback to register key strokes, the best possible solution is a stroke-based virtual keyboard, as opposed to a more traditional tap-based keyboard. The author states that such a keyboard would allow for quicker user input, especially with an optimized layout.

Methods
To test his hypothesis, the author had eight participants complete a series of 1152 stroke sequences through 4 points in a certain order, where each stroke had a beginning, a middle and an end. The participants had previous experience with interactive tabletops and were asked to stand for the duration of the testing so as to have better access to the tabletop. They were asked to complete the tasks as quickly as possible. These tests were done for the large number of keyboard layouts listed in the paper.

Results
Stroke-based typing showed a minimum speed improvement of 0% with Quikwriting and a maximum improvement of almost 41% with Square ATOMIK. Speed increases were substantial for all other keyboard layouts tested, showing the clear superiority of stroke-based typing.

Contents
The paper dealt first with proving the superiority of stroke-based typing and then with developing an ideal virtual keyboard layout that maximizes stroke-based typing speed. Using the results gathered from participant testing, the author was able to see that OPTI II was the layout that allowed the greatest words per minute for stroke-based typing. It was also concluded that it would take far too long (10^20 years) to find the optimal keyboard layout with an exhaustive algorithm. The author was, however, able to use an annealing process followed by hill climbing, starting from Square ATOMIK and Hex ATOMIK, to come up with two optimised layouts that were 0.5% faster in the square category and 7% faster in the hex category.
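The annealing-plus-hill-climbing idea can be shown in miniature: score a layout by how far a stroke travels over common letter pairs, randomly swap keys, and accept worse layouts with a probability that shrinks as the "temperature" drops. A toy sketch (my own, with a tiny made-up bigram table, nothing like the full optimisation in the paper):

    import math
    import random

    # Toy bigram frequencies (a real optimisation would use a large English corpus).
    BIGRAMS = {("t", "h"): 30, ("h", "e"): 25, ("a", "n"): 15, ("e", "r"): 12, ("i", "n"): 10}
    LETTERS = sorted({c for pair in BIGRAMS for c in pair})

    def cost(layout):
        """Average travel distance per bigram; layout maps letter -> (row, col)."""
        total = sum(BIGRAMS.values())
        return sum(f * math.dist(layout[a], layout[b]) for (a, b), f in BIGRAMS.items()) / total

    def anneal(steps=20000, start_temp=2.0):
        positions = [(r, c) for r in range(2) for c in range(4)]   # a tiny 2x4 grid
        layout = dict(zip(LETTERS, positions))
        current = cost(layout)
        best, best_cost = dict(layout), current
        for step in range(steps):
            temp = start_temp * (1 - step / steps) + 1e-6          # cooling schedule
            a, b = random.sample(LETTERS, 2)
            layout[a], layout[b] = layout[b], layout[a]            # propose a key swap
            proposed = cost(layout)
            if proposed < current or random.random() < math.exp((current - proposed) / temp):
                current = proposed                                 # accept the swap
                if current < best_cost:
                    best, best_cost = dict(layout), current
            else:
                layout[a], layout[b] = layout[b], layout[a]        # reject, swap back
        return best, best_cost

    random.seed(1)
    print(anneal())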

Discussion
Considering that various phones are starting to use stroke-based keyboard input (most notably the Swype feature for Android keyboards), it is exceedingly nice to see a paper about optimising the keyboard layout. While technically the paper is limited to interactive tabletop applications, surely the basic concepts of stroke-based typing carry across platforms. While it is true that most users are highly used to and comfortable with QWERTY, I feel there would be widespread interest in any optimised keyboard that was put out there, due to the speed it offers.

Tuesday, September 13, 2011

Paper Reading #6: TurKit: Human Computation Algorithms on Mechanical Turk

TurKit: Human Computation Algorithms on Mechanical Turk

By:
Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller

 Presented at UIST 2010
  • Greg Little has an undergraduate degree from Arizona State University and a PhD from MIT. He currently works at MIT.
  • Lydia B. Chilton has an undergraduate degree from MIT in Economics and EECS, along with a Masters, also from MIT, in EECS. She is currently a graduate student at the University of Washington.
  • Max Goldman is currently a Graduate student at MIT.
  • Robert C. Miller has a Bachelor of Science and a Masters of Engineering from MIT along with a PhD from Carnegie Mellon. He is currently an Associate professor at MIT.
Summary
Hypothesis

The authors of this paper hypothesised that the use of TurKit would allow for better coding of programs requiring human computation. The ability to make MTurk calls as plain function calls is supposed to let users apply more of their traditional approaches to programming and use the functions as building blocks for more complex algorithms, while keeping the code readable in a traditional way. The use of the crash-and-rerun programming model reduces the expense of human computation by allowing a script to run multiple times without repeating computationally expensive work.
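A minimal sketch of the crash-and-rerun idea (my own illustration in Python, whereas TurKit itself is JavaScript): expensive or non-deterministic calls are recorded in a journal the first time they run, so when the script is re-executed from the top those calls replay instantly from the journal instead of being performed again.

    import json
    import os
    import time

    JOURNAL_PATH = "journal.json"
    _journal = json.load(open(JOURNAL_PATH)) if os.path.exists(JOURNAL_PATH) else {}

    def once(key, expensive_call):
        """Run expensive_call only the first time this key is ever seen; afterwards
        replay the recorded result, even across crashes and re-runs of the script."""
        if key not in _journal:
            _journal[key] = expensive_call()
            with open(JOURNAL_PATH, "w") as f:
                json.dump(_journal, f)
        return _journal[key]

    # Stand-in for posting a HIT and waiting for a Mechanical Turk worker to answer.
    def ask_worker_to_describe(image_name):
        time.sleep(2)                       # pretend we are waiting on a human
        return f"a description of {image_name}"

    # Re-running this script after a crash never re-posts work that already finished.
    for img in ["cat.jpg", "dog.jpg"]:
        print(once(f"describe:{img}", lambda name=img: ask_worker_to_describe(name)))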

Methods
To evaluate the performance of their toolkit, they used the results of 20 different experiments run over the period of a year using TurKit. The experiments included iterative writing, blurry text recognition, website clustering, brainstorming, and photo sorting.

Results
The 20 experiments cost a total of $364.85 across 29,731 HIT assignments. This showed that TurKit's computational overhead is manageable; since waiting for human action takes an order of magnitude longer than the computation, TurKit is suitable for most applications.

Contents
The authors showed TurKit to be greatly useful, but admitted that it isn't scalable. They tested it thoroughly and implemented it in the best format possible; however, its lack of scalability made it less impressive.

Discussion
While I didn't quite understand everything about human computation, I did understand a fair amount of it. To me, anything that helps things run faster and more efficiently is a step in the right direction. Using crash-and-rerun seems ingenious, considering it removes a major redundancy in the calls. The fact that TurKit has great usability and makes writing scripts exceptionally easy is a big plus.

Wednesday, September 7, 2011

Paper Reading #5: A Framework for Robust and Flexible Handling of Inputs with Uncertainty

A Framework for Robust and Flexible Handling of Inputs with Uncertainty

By:
Julia Schwarz, Scott E. Hudson, Jennifer Mankoff, Andrew D. Wilson.
Presented at UIST 2010

  • Julia Schwarz has a BS in Computer Science from the University of Washington and is currently working on her PhD in HCI at Carnegie Mellon.
  • Scott E. Hudson has a BS and an MS in Computer Science from Arizona State University, along with a PhD (also in Computer Science) from the University of Colorado. He is currently a professor at Carnegie Mellon teaching and researching HCI.
  • Jennifer Mankoff has a BA from Oberlin College and a PhD in Computer Science from the Georgia Institute of Technology. She is currently a professor at Carnegie Mellon teaching and researching HCI.
  • Andrew D. Wilson is currently a Senior Researcher at Microsoft Research. He has a BA from Cornell University, along with an MS and a PhD from MIT.
Summary
Hypothesis
The authors state that they have developed a method to better handle ambiguous user inputs, so as to allow a more interactive environment in the future for humans interacting with computers. They find the current system of handling interactive user input to be insufficient and improper, thus inhibiting users from maximising their interaction with the computer. By modifying the handling of user inputs, they have enabled their system to take in and deal with all sorts of uncertain inputs, making the system more usable even for people with motor impairments.

Methods
To test their framework they chose to attempt various tasks and actions that a user would be required to do under normal use of a computer. Tasks attempted included responding to dialog boxes, scrolling, and text boxes that take an audio input and transcribe it to text. All the tests were done by the authors themselves except for the testing of buttons by users with motor impairments.

Results
Upon testing, their framework operated as expected. It handled uncertainty extremely well, opening up ways for greater interaction between humans and computers. In regards to operability for users with motor impairments, they saw a large reduction in errors when using their probabilistic method of handling inputs.

Contents
The entire objective of this paper was to challenge the current attitude towards the handling of interactive user inputs and to demonstrate the benefits of a more probabilistic method. The authors demonstrated how improved technology, aided by proper handling of uncertain inputs, greatly improves the operation of systems accepting interactive user input, thus allowing greater interaction.
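My rough mental model of the framework: instead of snapping a touch to the single nearest button, the touch is treated as a probability distribution, every nearby button gets a share of that probability, and an action only fires once some button is clearly the most likely target. A toy sketch of that idea (my own, far simpler than the authors' actual mediation machinery):

    import math

    BUTTONS = {"Save": (100, 40), "Cancel": (160, 40), "Help": (300, 40)}   # centres in px
    TOUCH_SIGMA = 25.0          # how uncertain we assume a fingertip location is (px)
    FIRE_THRESHOLD = 0.8        # only act when one button is this likely

    def target_probabilities(touch_xy):
        """Distribute the touch's probability mass over buttons with a Gaussian falloff."""
        weights = {name: math.exp(-math.dist(touch_xy, centre) ** 2 / (2 * TOUCH_SIGMA ** 2))
                   for name, centre in BUTTONS.items()}
        total = sum(weights.values()) or 1.0
        return {name: w / total for name, w in weights.items()}

    def handle_touch(touch_xy):
        probs = target_probabilities(touch_xy)
        best = max(probs, key=probs.get)
        if probs[best] >= FIRE_THRESHOLD:
            return f"fire {best}"
        return f"ambiguous: {probs}"    # a real system would defer or ask for mediation

    print(handle_touch((105, 42)))   # clearly on Save
    print(handle_touch((130, 42)))   # between Save and Cancel: stays ambiguous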

Discussion
I found the step away from the rigid and inhumanly precise interpretation of human actions to be very refreshing. The fact that they designed a system that interprets and treats inputs as human inputs, as opposed to immediately converting them and then treating them like computer inputs, is bound to be useful. This method of handling, interpreting, and treating inputs should really open the way for greater functionality in interactive operations on computers. In addition, it should also greatly help users with disabilities by allowing them to be less precise in their actions, thus letting them be more human.

    Tuesday, September 6, 2011

    Paper Reading #4: Gestalt: Integrated Support for Implementation and Analysis in Machine Learning

    Gestalt: Integrated Support for Implementation and Analysis in Machine Learning

    By:
    Kayur Patel , Naomi Bancroft , Steven M. Drucker , James Fogarty , Andrew J. Ko , James A. Landay
     Presented at UIST 2010.

    • Kayur Patel has a BS in Computer Science and HCI from Carnegie Mellon, an MS in Computer Science from Stanford, another MS in Computer Science and Engineering from University of Washington and is currently working on his PhD in Computer Science and Engineering from University of Washington under James Fogarty and James Landay.
    • Naomi Bancroft has an undergraduate degree in Computer Science and Linguistics from the University of Washington and currently works for Google.
    • Steven M. Drucker has a BSc in Neural Sciences from Brown University, an MS in Brain and Cognitive Sciences, and a PhD in Arts and Media Technology, both from MIT. He is currently a Principal Researcher at Microsoft Research.
    • James Fogarty holds a BS in Computer Science from Virginia Tech and a PhD in HCI from Carnegie Mellon. He is currently an Assistant Professor of Computer Science and Engineering at University of Washington.
    • Andrew J. Ko has a BS in Computer Science and Psychology from Oregon State and a PhD in HCI from Carnegie Mellon. He is currently an Assistant Professor at the University of Washington Information School and an Adjunct Assistant Professor in Computer Science and Engineering, also at the University of Washington.
    • James A. Landay has a BS in EECS from UC Berkeley along with an MS and PhD in Computer Science from Carnegie Mellon. He is currently a Professor of Computer Science and Engineering at the University of Washington.
    Summary
    Hypothesis
    The authors of this paper hypothesised that users would be able to do a better job of debugging machine learning code using their Gestalt development environment than with more traditional methods of debugging. They felt that the ability to implement a classification pipeline and then analyze the data as it moves through the pipeline, all the while being able to switch easily between implementation and analysis, would make that debugging work much easier.

    Methods
    For testing their hypothesis they selected a group of 8 participants matching Gestalt's target audience, and then gathered a set of baseline results and a set of Gestalt results. The baseline condition involved the creation, modification and execution of scripts; participants created visualizations by calling functions in a provided API, which allowed the reproduction of all of Gestalt's visualizations. Both baseline and Gestalt used the same data table structure; however, the baseline data table did not keep track of information generated across the pipeline. To further test the setup, users were asked to solve two problems, one of sentiment analysis and another of gesture recognition. Both problems had bugs, introduced by the authors, that replicated common programming errors.

    Results
    The participants unanimously preferred using Gestalt over the baseline for their troubleshooting. They were able to find more bugs faster and, similarly, to fix more bugs more quickly.

    Contents
    The authors of this paper spent a considerable amount of time on their testing and had a rather elaborate setup to make the testing as complete and thorough as possible. They placed a fair bit of importance on the data table structure and made it perfectly clear that it was most responsible for Gestalt's effectiveness.
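    The data table idea is easy to picture: every stage of the classification pipeline writes its intermediate output back into one shared table keyed by example, so a misclassified example can later be traced step by step. A bare-bones sketch of that structure (my own, not Gestalt's actual API):

        # Each example is a row; every pipeline stage adds columns instead of
        # throwing its intermediate results away.
        table = [{"text": "great movie, loved it", "label": "positive"},
                 {"text": "terrible plot, awful",  "label": "negative"}]

        def tokenize_stage(rows):
            for row in rows:
                row["tokens"] = row["text"].lower().split()

        def feature_stage(rows):
            positive_words = {"great", "loved", "good"}
            for row in rows:
                row["pos_word_count"] = sum(t.strip(",") in positive_words for t in row["tokens"])

        def classify_stage(rows):
            for row in rows:
                row["prediction"] = "positive" if row["pos_word_count"] > 0 else "negative"

        for stage in (tokenize_stage, feature_stage, classify_stage):
            stage(table)

        # Because intermediate columns were kept, a wrong prediction can be traced back.
        for row in table:
            flag = "" if row["prediction"] == row["label"] else "  <-- inspect this row"
            print(row, flag)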

    Discussion
    Even though I'm not all that much into heavy programming, even I can appreciate the usefulness of this. Debugging has historically been a pain for me and most people I know when it comes to programming, and frankly any tool that can help make that torturous process easier is a tool to be worshipped. I felt the authors did a spectacular job of making sure their testing was fair and followed a scientific process, comparing a set of control results with a set of experimental results. They also did a great job of stating at the very start what they were trying to do, proving that they were right, and then stating exactly what their system could not do.

    Paper Reading #3: Pen + Touch = New Tools

    Pen + Touch = New Tools

    By:
    Ken Hinckley, Koji Yatani, Michel Pahud, Nicole Coddington, Jenny Rodenhouse, Andy Wilson, Hrvoje Benko, Bill Buxton.

    Presented at UIST 2010.

    • Ken Hinckley has a PhD from University of Virginia and is currently a Principal Researcher at Microsoft Research.
    • Koji Yatani is currently working on his PhD in HCI at the University of Toronto.
    • Michel Pahud has a PhD in Parallel Computing from the Swiss Federal Institute of Technology and is currently working at Microsoft Research.
    • Nicole Coddington has a Bachelors in Visual Communications from the University of Florida and currently works for HTC as a Senior Interaction Designer.
    • Jenny Rodenhouse has a Bachelors in Industrial Distribution from Syracuse University and is currently working as an Experience Designer for Microsoft.
    • Andy Wilson has a Bachelor of Arts from Cornell University along with a Master of Science and a PhD from MIT. He is currently a senior researcher for Microsoft.
    • Hrvoje Benko has a PhD from Columbia University and is currently employed as a researcher at Microsoft Research.
    • Bill Buxton has a Bachelor of Music Degree from Queen's University and is currently a Principal Researcher at Microsoft Research.
    Summary
    Using typical human interaction with pen and paper as a model, the authors have attempted to replicate and improve upon that concept. Their device uses a "pen" to write, the "touch" of a hand to move things around, and the combination of both (pen + touch) to give the user a plethora of other options to manipulate their work with.

    The authors observed users making a physical arts project (a scrapbook) and catalogued their gestures, and then refined the gestures so as to simplify the programming and improve efficiency. Using this data they were able to create a functioning prototype that other users could test. The test users were asked to replicate a set of tasks previously done using paper, clippings, etc.

    For the most part the users responded highly favourably. People found the pen+hold features to be highly intuitive, and yet struggled to use the gestures without explanation. The concept of objects was the real underlying concern with the current prototype.

    This paper is intended to be more theoretical than practical. The authors' aim was to research how people typically use their hands to work and then try their best to replicate those behaviours. Behaviours specific to the dominant vs. non-dominant hand, how the pen is handled when idle, etc. have been the cornerstones of this project, and thus the testing done was limited.
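    The division of labour is simple enough to sketch: pen alone writes, touch alone moves things, and pen while an object is held by touch triggers a special tool. Below is my own toy dispatcher for that rule (nothing like the real prototype's code; the "cut" tool is just one example of the kind of combined tool the paper describes):

        class PenTouchCanvas:
            def __init__(self):
                self.held_object = None          # object currently pinned by a touch

            def on_touch_down(self, obj):
                self.held_object = obj           # touch manipulates: hold/drag the object

            def on_touch_up(self):
                self.held_object = None

            def on_pen_stroke(self, stroke):
                if self.held_object is None:
                    return f"ink stroke on the page: {stroke}"          # pen writes
                # pen + touch = new tool, e.g. cutting the held photo along the stroke
                return f"cut '{self.held_object}' along {stroke}"

        canvas = PenTouchCanvas()
        print(canvas.on_pen_stroke("squiggle"))       # plain inking
        canvas.on_touch_down("photo_42")
        print(canvas.on_pen_stroke("straight line"))  # pen while holding: the cut tool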

    Discussion
    Considering how little effort was put into the final prototype and how much of the work was theoretical, this is quite fascinating. Given the vast amount of data they've collected for this paper, a more thorough implementation of their idea with slightly more intuitive controls would be an exceedingly useful product. Reading the paper, it's easy to see that they have an extremely good grasp of what needs to be done, what the difficulties are, and roughly how to work around those problems. Really it's now just a matter of sitting down and working through all the implementation issues, because theoretically they've got a model that mimics human behaviour extremely well.

    Paper Reading #2: Hand-On Math: A page-based multi-touch and pen desktop for technical work and problem solving

    Hand-On Math: A page-based multi-touch and pen desktop for technical work and problem solving

    By:
    Robert Zeleznik, Andrew Bragdon, Ferdi Adeputra, Hsu-Sheng Ko.

    Presented at UIST 2010
    • Robert Zeleznik received his undergraduate and Masters degrees from Brown University. He is currently the Director of Research at Brown University, along with being the co-founder of his own company.
    • Andrew Bragdon received his undergraduate and Masters degrees from Brown University. He is currently a PhD student at Brown University.
    • Ferdi Adeputra is currently studying Applied Math and Computer Science at Brown University.
    • Hsu-Sheng Ko is currently studying at Brown University.
    Summary
    The scientists from Brown were driven by their belief that the lack of user-friendly input methods for Computer Algebra Systems (CAS) results in the systems being underused, which in turn leaves mathematical problems to be solved suboptimally with pencil and paper. To counter this, they developed a platform that aims to be almost as intuitive to use as paper while still being able to interface with a CAS, thus combining the free-form ability of paper with the computational power of a CAS.

    To test their idea they had participants try to use the "Touch Pad" as if it were a piece of paper. Problems were worked as if they were being done on paper, with the added functionality of the CAS. Participants were asked to create and manipulate pages, use the available graphics options to graph equations and manipulate them, use "PalmPrint" to switch between modes and draw annotations in various colours, and manipulate page contents using tap gestures.

    While there was some resentment about the bulkiness of the device, the participants found the concept highly intriguing and very useful. Basic functions (i.e. using it as paper) required no instruction. The gesture technology required a considerable amount of instruction; however, once the training was complete, most users found it fairly easy to use. Users felt it was a great tool for helping organize their documents and found the mathematical features useful, though they felt greater functionality (i.e. beyond high-school-level math) would obviously be preferred. While users found the bimanual features of the product to be problematic, on the whole the device was favoured.

    Testing had a large scope due to the experimental nature of the device. Features included page management (which allowed users to almost replicate having multiple pieces of paper lying around, and allowed page "folding" to make problem organization simpler), gesture recognition, and bimanual techniques such as PalmPrint (which allowed users to control colour, etc. with the palm of one hand while continuing to work with the other).
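    The CAS side is the part I can sketch: once the handwritten ink has been recognised into an expression string, handing it to a computer algebra system is straightforward. A small example using SymPy (my own illustration of the general idea, not the system's actual backend):

        import sympy

        x = sympy.symbols("x")

        # Pretend the handwriting recogniser produced this string from the user's ink.
        recognised_ink = "x**2 - 5*x + 6"
        expression = sympy.sympify(recognised_ink)

        print(sympy.factor(expression))      # (x - 2)*(x - 3)
        print(sympy.solve(expression, x))    # [2, 3]
        print(sympy.diff(expression, x))     # 2*x - 5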

    Discussion
    I find this idea to be extremely intriguing and very useful. Considering the abusive amount of math needed in engineering and the widespread use of programs like Maple/MATLAB, the ability to have similar features on something like a tablet PC would be spectacular. It would allow math students to learn and manipulate problems in ways that have never been possible before, and it would give users the ability to experiment with problem solving accurately. While this product may not be where it needs to be to be marketable, the authors have definitely come up with a spectacular idea that could, with some improvements, completely revolutionize how we do math.

    Thursday, September 1, 2011

    Paper Reading #1: Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback

    Imaginary Interfaces: Spatial Interaction with Empty Hands and without Visual Feedback




    By:
    Sean Gustafson, Daniel Bierwirth and Patrick Baudisch.



    Sean Gustafson is a PhD student under Patrick Baudisch working in HCI at Hasso Plattner Institute in Potsdam, Germany.
    Daniel Bierwirth is a Co-founder of two companies that specialise Mobile applications. He holds an MSc in IT-Systems Engineering from Hasso Plattner Institute in Potsdam, Germany.
    Patrick Baudisch is a Professor of Computer Science at Hasso Plattner Institute in Potsdam, Germany. He is also the Chair of their HCI lab. His research revolves around miniaturisation of mobile devices and their touch inputs.

    Presented at UIST 2010.

    Summary

    In this paper the researchers are trying to achieve the ultimate level of miniaturisation possible by completely doing away with the screen. The system relies entirely on the user's short-term memory and their ability to keep track of what they were "drawing" and to pick up where they left off, with absolutely no feedback from anything but their mind. The researchers ran a few tests with random test subjects to prove that it was in fact possible for people to use such a super-miniaturised device without any external feedback.


    1. Using their non-dominant hand to create an L shape to be used as a point of reference, users were asked to try to draw 3 different styles of images from memory. They had to draw Graffiti characters, repeatedly draw a shape in the same spot, and finally draw a few diagrams that required more than one stroke. The researchers hypothesised that users would have fewer errors with the Graffiti characters and that they would get more data on decreased accuracy when multiple strokes were needed for an image. Over the course of this experiment they were able to show that users could recreate the objects they had been asked to create, from their short-term memory, with a fair bit of accuracy. It was noted that most errors occurred on the diagrams that required multiple strokes.

    2. The second experiment used a new set of users. Again, they used their non-dominant hand to create an L shape to use as a point of reference. Users were asked to draw an image, rotate 90 degrees, and find a point on that image. Some were asked not to rotate before finding a point on the image they had drawn. As hypothesised, there were far fewer errors among users who did not have to rotate before selecting a point on their drawn image.

    3. For the third and final experiment, the objective was to select points on an imaginary coordinate system, using the non-dominant hand as the axes. The hypothesis was that there would be greater error as the coordinate points got further away from the origin. Users were asked to make an L with the non-dominant hand and then given coordinates to point to. As hypothesised, the errors drastically increased as users got further away from the origin.

    Considering the results of these tests, it was concluded that the system worked better when the user was stationary and worked best while things were still fresh in the user's short-term memory. As time passed, the user's memory faded and errors increased.

    Discussion

    While these tests may have proved the concept beyond doubt, its current implementation leaves much to be desired. A product like the one tested would probably be cheaper considering the lack of a screen, extremely portable, and novel enough to be very appealing. That being said, it relies far too heavily on a user's ability to picture things and on a strong short-term memory. In a consumer-driven world where companies are working hard to make products simple and easy to use, a product that requires this much thinking and mental activity immediately limits itself to a niche market.

    In addition, when considered from a reliability perspective, this system as currently implemented just doesn't have the ability to compete. As technology improves, screens are liable to get cheaper and more energy efficient. While they may lack the novelty of a Star Trek-like product, I feel the business community would greatly prefer a product with guaranteed, tangible reliability over something that depends on how well I can remember images.

    On Computers

    To accurately argue about the existence of souls, or lack thereof, in computers, one needs to first and foremost define the existence of a soul. Considering the fact that I'm an engineer and not a philosopher, I'll take the most obvious explanation available to me: if it can feel emotion, it has a soul. While this may be fiercely argued over, I feel it's the safest assumption to make if such a conversation is to reach a productive end. Continuing from this assumption, one can safely answer Aristotle's question regarding plants: do they have souls? They most certainly do not. While plants may respond to stimuli (heat, lack thereof, moisture, etc.), that response is like a mechanical system and truly nothing like emotion. From what little biology I have taken in my life, I have learnt that emotion is caused by rather complex chemical reactions. Plants have chemical reactions, but these are limited to capturing sunlight and making glucose. As far as I have heard, no one has ever come across a plant with chemical reactions that would imply the existence of emotion.

    Moving away from plants, on to computers. Do computers have souls? Just like plants, no, they do not. It doesn't matter how powerful and well-programmed an AI one makes, all its actions are pre-determined. Even the probabilistic actions are pre-determined. AI that tries to mimic human action and human thought can come awfully close, and even if there exists an AI that can make the same irrational choices as humans, it still won't have a soul, for the simple reason that it feels no emotion. Humans can be taught and conditioned; however, they cannot be programmed with the precision of computers. Any teaching or conditioning done to a person is at the mercy of that person's emotional stability. All it takes is a few chemicals mixing in the wrong order, and the next thing you know, all that teaching and conditioning goes out the window.

    If one were to argue that emotion could be built into computers somehow, it would still be arguable that computers lack a soul. What separates humans from computers is the simple fact that we don't need to be taught to have emotions. A sufficiently talented programmer with the proper tools can program anything. But can a programmed piece of hardware have a soul comparable to that of a human? I don't think it can. It feels sad not because it legitimately understands what sadness is about; it feels sad because it has an algorithm that says it's time to feel sad. The lack of built-in emotion is what keeps computers from having souls. If one day I turn my laptop on and it suddenly starts complaining that it feels used by me, then I will willingly accept that computers have souls, because my computer will have come to life of its own accord, without anyone's help, and become self-aware. It will have felt used without me having to program it to feel used. Until such time, as far as I am concerned, computers and plants are both for use without guilt; they lack emotion, therefore they lack souls.

    Blog -1

    Shashwat Tyagi

    shashwatt@tamu.edu

    5th Year Senior

    I am taking this class because it's a class I felt would be highly interesting and it fit in very well with my graduation plans. Human Computer Interaction means anything from me typing this up to Star Trek level interaction, and both are highly intriguing topics.

    I am for the most part a hardware guy, however I greatly enjoy the algorithms and abstract programming.

    10 years from now I am hoping to be in a management position in an engineering firm somewhere.

    If Microsoft could make an operating system that did not eat up my hard disk every few weeks, I think that would be a HUGE leap in computer science.

    Given the opportunity, I feel Edgar Allan Poe would be an extremely interesting person to talk to. His ability to come up with twisted, horrifying tales whilst maintaining a sense of reality was exceptional. To know what drove that, and what it took to do that would be fascinating to me.

    My favourite shoes are my running shoes. They're grey, light, and they breathe extremely well. They're my favourite shoes because I enjoy running, or athletic activity in general, and they allow me to do what I enjoy.

    I would like to be able to speak and read fluent Russian. I've always found the Russian language and culture incredibly intriguing, and there are so many fantastic plays, poems and novels written in Russian and I would like the ability to read them without having to worry about things being lost in translation.

    I am Indian, I sound like an odd cross between an Australian, an Irishman, an Englishman and occasionally a Scotsman.