Note: The answers to the following questions were written in January of 2015. Some of the answers are a bit outdated.
When asked to briefly describe myself, my response tends to vary with the context. Here are a few ways that I have described myself in the past:
Short answer 1: I am a person who loves to learn, improve upon things, create new things, and help people.
Short answer 2: I am a guy who grew up in Virginia's Shenandoah Valley, fell in love with the beauty of nature and art, then became a scientist and engineer.
Short answer 3: I am a computational/theoretical astrophysicist and software engineer. As a graduate student, I learned that many people believe that it is impossible to simultaneously be a good scientist and a good software engineer. I took that as a challenge.
Short answer 4: I am a scientifically and artistically inclined individual who often views things a bit differently than the majority of his peers. This is partially because I have a somewhat unusual background. I come from a rural region of western Virginia. Neither of my parents completed high school, and yet I have a Ph.D. I spent much of my childhood building things, watching documentaries, thinking, and reading Audubon field guides and other science-related books. I have a natural tendency to think more independently than most people; bandwagon thinking has always irritated me. I've also never really been impressed with the opinions of people based merely on their status as an "expert" or "authority" on a subject.
A: Physics is the most fundamental science. I knew that learning more about physics would allow me to understand many things more easily. Furthermore, studying physics requires a rather deep understanding of mathematics, which is often useful when trying to understand things that are not immediately related to physics.
A: I was initially drawn to astrophysics because astrophysical phenomena and astronomical size scales are simply amazing. Astrophysics is one of the most active and fastest-growing subfields of physics because so many mysteries remain unsolved.
A: I enjoy being able to transform an idea into a usable item by simply typing. With software engineering, an idea can quickly become its own entity, capable of performing actions, such as creating other entities, solving problems, or simplifying tedious tasks. This is pretty amazing, if you pause to think about it.
A: I like to discover trends, relations, and patterns in data. It's always fun to see something in data that no one has seen before. It is even more exciting if the insight is beneficial or surprising in some way. Data science also involves other things that I enjoy, such as software engineering.
I haven't worked as a data scientist, but I have the basic skills that are needed. Other skills would be honed after obtaining a data science position. Several of my colleagues have become data scientists after completing their doctoral work, and they all say that the work is enjoyable. They also indicated that the move from physics / astrophysics to data science was rather easy.
A: I am passionate about teaching, learning, discovering creative solutions to problems, helping people, and improving things. These are all related, of course.
A: There have been many. A few notable ones include:
- An IR laser touchscreen system that could, in principle, turn any flat rectangular surface into an input device. I also wrote software to translate the input data into x-y coordinates.
- The volume rendering component of GSnap was purely for fun (see my projects page).
- My Pretty Parametric Plot generator (see projects page).
A: I see myself working as either a senior data scientist or a senior software engineer. I may also consider returning to academia as a professor if a suitable position becomes available at a small undergraduate-only university.
A: I have not yet needed to use technology X heavily. It is likely that I haven't had time to try it out myself or I haven't found that particular technology to be interesting enough to learn, just out of curiosity. For instance, I haven't used Apache Hive, Apache Pig, or Apache Storm because I haven't needed them yet. Also, my experience with SQL, Java, and R is very limited. I have only needed to use these technologies briefly, but I would quickly learn more about them if I needed to do so.
A: In terms of my value to an employer, my biggest weaknesses are likely the following:
- I am only willing to work an average of 40 hours per week and I can no longer take a significant amount of work home because I have a wife and a young child. During graduate school, I worked as much as 100 hours per week, but other parts of my life suffered. So I am capable of working extended hours, but I am not willing to do so at the moment.
- I am allergic to smoke, spores, certain fragrances, and certain types of pollen. Consequently, I become ill due to allergic reactions at least twice per year. Fortunately, having a decent-paying job with decent health benefits should allow me to finally be treated by an allergist.
- I can be perfectionistic and obsessive at times. This can slow me down a bit—particularly when writing papers, proposals, and e-mails. The issue is improving with time.
- Although I haven't been diagnosed, I am pretty sure that I have a condition related to dyslexia. I read, write, and type more slowly than most people. I am also very slow with mental arithmetic and I have several other symptoms associated with dyslexia. This slows me down in some situations. For instance, I am horrible at taking detailed notes in a lecture / training situation; I mostly rely on my memory in such cases.
A: If I were forced to choose only one strength to label as my "greatest strength," I would likely choose among the following:
- Intellectual independence
- The ability to quickly identify the underlying causes of problems.
A: In terms of software engineering and algorithm design, the most challenging problem that I have encountered involved developing an efficient volume renderer, capable of making realistic images of simulated galaxies while using less than 2 GB of RAM. I used a custom, computationally-inexpensive smoothing kernel to convert the raw snapshot data (masses, positions, stellar age, gas temperature) into a continuum. I developed an algorithm that only needed to store one depth-slice of voxels at any given time, so that only a small fraction of the data needed to be stored in RAM. I used experimentation and heuristics to eliminate unnecessary computational work. For instance, I found efficient ways of skipping over regions of space that were effectively empty and I detected situations in which effort was wasted on sub-voxel rendering. The code was multithreaded. I performed cache optimizations and used "hot" and "cold" function attributes in order to help the compiler to optimize the code.
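The slice-at-a-time idea described above can be illustrated with a minimal sketch. This is not the GSnap code: the kernel form, the function names `kernel` and `render`, and the unit-cube particle layout are all assumptions chosen for illustration. It shows only two of the ideas mentioned: holding a single depth slice of voxels in memory, and skipping slices that are effectively empty.

```python
import math


def kernel(r, h):
    """Cheap compact-support smoothing kernel (an assumed form, not the original)."""
    q = r / h
    return (1.0 - q * q) ** 2 if q < 1.0 else 0.0


def render(particles, n, h):
    """Project (x, y, z, mass) particles in the unit cube onto an n x n image.

    Only one depth slice of voxels is held in memory at any given time.
    """
    image = [[0.0] * n for _ in range(n)]
    cell = 1.0 / n
    reach = math.ceil(h * n)  # kernel radius measured in voxels

    for k in range(n):
        zc = (k + 0.5) * cell  # depth of this slice's voxel centers
        near = [p for p in particles if abs(p[2] - zc) < h]
        if not near:
            continue  # skip slices that no particle can contribute to

        depth_slice = [[0.0] * n for _ in range(n)]  # the only slice in memory
        for x, y, z, mass in near:
            ci, cj = int(x * n), int(y * n)
            # Visit only the voxels that lie within the kernel's support.
            for i in range(max(0, ci - reach), min(n, ci + reach + 1)):
                for j in range(max(0, cj - reach), min(n, cj + reach + 1)):
                    dx = x - (i + 0.5) * cell
                    dy = y - (j + 0.5) * cell
                    dz = z - zc
                    r = math.sqrt(dx * dx + dy * dy + dz * dz)
                    depth_slice[i][j] += mass * kernel(r, h)

        # Collapse the slice into the image, then let it be discarded.
        for i in range(n):
            for j in range(n):
                image[i][j] += depth_slice[i][j]

    return image
```

Calling `render([(0.5625, 0.5625, 0.5625, 1.0)], 8, 0.3)` produces an 8 x 8 image whose brightest pixel is the one containing the particle. A real renderer would also need the other pieces mentioned above, such as multithreading, cache-aware data layout, and compiler hints, none of which this sketch attempts.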
A: In this particular situation, I was able to test the solution by simply looking at the output images and measuring the memory usage and execution time. There is no general way to verify the correctness of an algorithm or an implementation of an algorithm. Depending on what is being verified, I might (1) compare the output with analytic results, (2) compare the output with previously-verified algorithms that perform the same task, (3) compare the output with experiments or observations, (4) test the algorithm with a large variety of inputs, or (5) provide a formal proof.
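As a small illustration of strategies (1), (2), and (4), here is a minimal sketch. The functions `trapezoid`, `insertion_sort`, `matches_analytic`, and `matches_reference` are hypothetical examples written for this purpose, and the tolerance and trial counts are arbitrary assumptions.

```python
import math
import random


def trapezoid(f, a, b, n):
    """Composite trapezoid rule with n equal panels."""
    step = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * step)
    return total * step


def insertion_sort(values):
    """Simple sort used here as the implementation under test."""
    out = []
    for v in values:
        pos = len(out)
        while pos > 0 and out[pos - 1] > v:
            pos -= 1
        out.insert(pos, v)
    return out


def matches_analytic(n=1000, tol=1e-5):
    """Strategy (1): the integral of sin(x) over [0, pi] is exactly 2."""
    return abs(trapezoid(math.sin, 0.0, math.pi, n) - 2.0) < tol


def matches_reference(trials=200, seed=0):
    """Strategies (2) and (4): compare with a trusted routine on many random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        if insertion_sort(data) != sorted(data):
            return False
    return True
```

Neither check proves correctness. Agreement with an analytic result or a reference implementation only raises confidence, which is why a formal proof is listed separately as option (5).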
A: My design process depends upon the complexity of the problem and the context. If the project is very complex, I find that it is best to do considerable planning ahead of time. At minimum, I sketch out the main components and sub-components, write all of the requirements, and describe how each requirement will be met before I begin writing code. This often involves designing one or more algorithms or data structures. Once the initial planning has been completed, I review the design in order to avoid needing to make major changes later on. After I begin writing and testing code, additional changes are usually made in order to make the code work more efficiently, make it easier to use, or make the project easier to maintain.
If the project is less complex, the process is less formal. After getting an idea of what I want to do, I begin writing code (or at least writing class definitions) immediately. Then member functions (methods) are defined and I have a working prototype. From there, more features are usually added, bugs are identified and fixed, and the code is frequently refactored.
A: The disagreements that I have had, thus far, have centered on the choice of methodologies and technologies that should be used. Here are two examples:
- When we were evaluating tools to use as components of a Big Data framework, my supervisor became very enthusiastic about a proprietary tool. I wasn't convinced that this particular tool would be a good choice, but I tried it out for a few days and then pointed out all of the drawbacks that I had found. For instance, the tool only worked when it was able to log in to a particular authentication server, the source was not available for inspection and modification, it required the system's hard drives to be formatted with a proprietary file format that could only be used by this tool, and achieving the highest performance would require us to pay for a very expensive license to unlock extra features. My supervisor agreed that my preferred set of tools was the better option because they offered more flexibility and they were free.
- When collaborating on a paper with my supervisor, I wanted to use Git as a collaboration tool so that we could directly edit the source as we wrote the paper. My supervisor preferred to use Google Docs to collaborate on the text, then spend time converting the document to the appropriate format at a later stage. My supervisor prevailed in this case. It became clear that he had no interest in learning to use Git and it would have been a waste of effort for me to insist on Git.
A: I have worked collaboratively on papers and proposals, but my experience collaborating on software is limited to high-level discussions of requirements, new features, and user interface design. For instance, I periodically describe potential new code features to my supervisor. My supervisor then provides feedback. Sometimes the feature is modified a bit before being implemented. In other cases, the feature is implemented in the form that I originally proposed. If the feature is less important to my supervisor, it is delayed until higher-priority tasks have been completed. I would like to collaborate on the source code and perform code reviews, but I have not yet had the opportunity.
A: I prefer a mixture of the two: a situation in which most of the work is done independently, but ideas are discussed and progress updates are shared with colleagues frequently enough that everyone in the group is aware of everyone else's work. This allows work to be accomplished at a decent pace because the team members are not spending all of their time in meetings. When a problem is encountered or a major decision needs to be made, everyone can contribute without needing a lengthy primer.
A: First, I determine whether the problem depends on input data or certain special circumstances. If it does, I vary the input, the operating environment, and the usage pattern to identify the circumstances that trigger the problem. My next step depends upon my level of familiarity with the code...
If I wrote the software or I am very familiar with it for other reasons: I am usually able to find the cause of the bug by quickly reading through the parts of the code that I know are likely causing the issue. If this method fails, I either write additional debugging messages and run the code again or I use a debugger.
If someone else wrote the software and I have access to the source: I try to become familiar with the code by using a debugger and reading through the relevant source. If I am unable to isolate the cause of the bug in a reasonable amount of time, due to code complexity, I import the source into an IDE so that the IDE can assist me as I examine the problem.
If I do not have access to the source: There are fewer options available, in terms of identifying the specific cause of the bug, but the problem might still be resolved. I search the Web to see if anyone else has encountered the same issue. I make sure that I have the latest version of the software. If that fails to resolve the issue, I try older versions of the software, if they are suitable. Examining the binary files and the logs (if there are any logs) is sometimes helpful. I usually end up contacting the developers by filing a bug report or sending a direct e-mail message.
A: My hobbies include, in no particular order:
- Software development
- Algorithm design
- Digital photography
- Ornamental gardening (bonsai & water gardening)
- Vegetable gardening
- Designing and building miscellaneous items
- Developing clear explanations for things that are frequently taught poorly in college or high school courses (usually mathematics topics)
A: I am not interested in a full-time management position and I will likely remain uninterested in such a position for the next few years. I would be comfortable in a supervisory role as long as I am able to spend at least 80% of my time doing creative work (i.e., science and engineering).