This lesson is still being designed and assembled (Pre-Alpha version)

MicroData Onboarding

This is the onboarding page of CEU MicroData. If you are here for the first time, welcome to the team! If you are revisiting something or updating this page, you are just as welcome!

This is how this page works: the lessons guide you through some of the most important tools, data, and procedures we work with. They are designed to help you throughout your time at MicroData, and hopefully they contain information that you will revisit often. Nevertheless, they do not replace learning from your peers. Whenever you run into a question that a quick Google search or Stack Overflow cannot answer, ask it on Slack. Even better: once you find the answer, consider adding it here so that others can find it more easily.

Enjoy both the lessons and your time at MicroData!

Prerequisites

Make sure that you have a signed contract. Even though some lessons are useful in general, it is best to have full access to the MicroData resources (e.g. the server, Slack, and GitHub) first. To gain access to every resource, you need a CEU-issued e-mail address, which you should receive shortly after signing your contract.

Schedule

Setup Download files required for the lesson
00:00 1. Introduction Key question (FIXME)
00:00 2. Using the terminal How to navigate in the terminal?
How to execute commands and start applications?
What are some of the most important command line tools?
00:00 3. A quick introduction to Git and GitHub What are the basic terms used by version control systems?
Which files are contained within the .git directory?
How to install git?
What does the basic collaborative workflow look like?
What are some of the most important git commands?
00:00 4. How to use the haflinger server How to connect to the haflinger server?
What is the structure of the server?
What is the general workflow on the server?
00:00 5. bead: Chaining your data and code together How do you ensure that your data products are reproducible?
00:00 6. Key datasets in MicroData Which are the most important datasets and how to join them together?
What is an LTS dataset?
Where do I find the datasets on the server?
00:00 7. How to work with the data What are the general rules about working with data?
How to handle sensitive data?
00:00 8. Tools in MicroData What is a programming tool and why do we use them?
How to read a tool manual and understand the outputs?
00:00 9. Best practices How to name files and variables?
What code style do we use?
How to ensure reproducible research?
00:00 10. Stata style guide How to name variables?
What code style do we use?
00:00 11. How to use csvkit How to open data with csvkit?
How to select certain rows and columns of the data? How to append them after filtering?
How to sort and describe basic characteristics of the data?
00:00 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.