ESA Space Challenge at Junction hackathon: winning team

At Junction hackathon 2017, one of the challenges was given by the European Space Agency (ESA): building Space Data Enriched Applications using the Copernicus APIs with open remote sensing datasets.

Together with Jelena Pantović, Polina Rozenshtein, and Roman Kotelnikov, we developed the idea that we named re(n)laks (a combination of two words in Norwegian meaning salmon and pure).

We won the challenge and received a voucher from ESA for business development consultations and coaching. Let’s see what we do next 🙂

 

Ideas from the book

‘Everybody lies’ by Seth Stephens-Davidowitz

If you are into data science, computational social science, social computing, big data or any of the other fancy terms for applying data science to the large datasets we constantly collect nowadays, then you will likely, like me, find this book interesting.

Seth holds a PhD in economics. His ideas and approaches to big data are so novel and out-of-the-box, at least from the perspective of a non-economist like me, that it is no surprise to me that he has also held positions at Harvard, Google and The New York Times.

Summary of the findings in the book:

Are Freudian slips real?

A simple answer is NO. The frequency with which people make mistakes such as typing fuckiest instead of funniest, or penestrian instead of pedestrian, turns out to be no larger than the frequency with which randomized bots would make such mistakes.

However, on another note, Freud might have been right… The Oedipus complex.

Is incest present in sexual fantasies?

Such fantasies are present in 9-16% of all porn site searches (for women vs. men).

Another apparently random question explored is about successful basketball players. Is it really true that they are more likely to come from difficult family and neighborhood backgrounds?

Again the answer is NO. On the contrary, kids from middle-class or well-off families have higher chances of succeeding in basketball.

This question is my favorite and has nothing to do with digital big data as we have come to think of it. What makes great racing horses?

Many people have collected diverse data about racing horses over many years, including their parentage, how they behave at certain times and their various physical attributes. However, one man had the idea of measuring the size of their hearts, in particular the left ventricle. This turned out to be the single best predictor of the most successful racing horses. Thanks to this finding, he convinced the owner of a later champion horse, American Pharoah, not to sell it at a point when he intended to.

 

 

To the Moon and back

Exactly one year ago, last summer, I was lucky to meet in person one of the people about whom I read as a kid with such awe and a sort of distant inspiration: Buzz Aldrin. As you must know, he and Neil Armstrong were the first people to walk on the Moon (I like less the version where they say that he is ‘the second person on the Moon’).

Buzz serves as ISU chancellor, so we are lucky that he visits us during many SSP (Space Studies Program) events.
While I am not present this summer at SSP17 in Cork, Ireland, Facebook reminded me, so here is a repost of the SSP16 blogpost I wrote exactly one year ago after meeting Buzz.

I am still not clear about what exactly he meant by his comment about Montenegro 🙂

Where are you from?
— Montenegro, … ex Yugoslavia.
Ah, it’s always been mysterious, the Mrs Broz’s property.

Computational Social Science

I find computational social science [1] to be a nice term coined for the relatively new interdisciplinary field that can be summarized as computational methods applied to large datasets to investigate questions from the social sciences. Hence, it involves several subdisciplines: computational sociology, computational economics, computational linguistics, and even computational sociolinguistics, culturomics and many others.

After attending the Third International Conference on Computational Social Science, IC2S2 2017, in Cologne, Germany, during the past week, I feel I can say some things about the current situation in the field. However, my blogpost can only scratch the surface of some of the topics that resonated with me, and hence is not representative of the field. Given around 120 accepted talks and 80 posters, it was impossible to follow all the results. Nevertheless, I hope that the summary that follows showcases a couple of developments and trends that are worthwhile and interesting.

Social Media for Health

Ingmar Weber and Yelena Mejova (Qatar Computing Research Institute) summarized in their tutorial talk how researchers in the field investigated health.

Depression, one of the most severe diseases of today, has been tackled a lot. In particular, to infer the likelihood of suicidal ideation, researchers used data from semi-anonymous support communities on Reddit [2]. They showed that it is possible to predict (to a certain degree) from previous discussions when someone will start having suicidal discussions. In another study, Instagram photos posted by depressed individuals were found more likely to be bluer, grayer, and darker [3]. Moreover, people performed worse than the algorithms at predicting depression from Instagram photos. In both cases, we see that algorithms running on online data could be detecting psychological diseases: how will this be used in the near future?

If the examples discussed inspire you to develop interventions and support, then the following examples show how interventions can be risky if not carefully designed. Namely, on Flickr, there exist unfortunate groups that support and promote anorexia (pro-anorexia). Seemingly a good sign, researchers also found counter-groups (pro-recovery) that try to reach the members of the first groups and inform them of the negative consequences and dangers of starving oneself. However, research results show that such pro-recovery groups are, at the moment, only counterproductive. They entrench the pro-anorexic individuals in their stance [4]. As another example, we heard of a short-lived project that aimed to warn the friends of individuals vulnerable to depression (using methods similar to those described above). However, what happened is that some malicious people used this service to intentionally harm such vulnerable people. These examples show that no matter how good your intentions are, you need to carefully attend to their possible effects online, because the effects can be unpredictable. Hence, for those of us who still want to help others, the (research) question is how to best design positive health interventions using social media.

Investigating Psychology using online data

Have you ever wondered how many of the reviews on Amazon or TripAdvisor are fake? Those same reviews that you might be basing your decisions on. The answer (for which I do not have a citation) is up to 30%. Nevertheless, I believe that the crowdsourced content platforms are still working well for my purpose —  likely because of the efforts by companies to deal with the fake contributions.

Now, one possible way to detect fake reviews is to use network analysis methods (such reviews have different patterns and frequency compared to real ones). However, during IC2S2, I learned about another method that is equally fascinating. Namely, there are established theoretical principles about deceptive statements versus true ones:

  • honest statements are richer in detail (The theory of Reality Monitoring),
  • they contain more contextual references to people, times and places (Criteria-based Content Analysis),
  • fake statements avoid information that could potentially be checked (Verifiability Approach).

The computational social science approach is now ‘just’ to develop methods that evaluate reviews based on these three principles, and you have a fake review detection approach [5].

The study about psychological and personality profiles of political extremists [6] that fascinated me is the last one I will discuss herein.

You have probably wondered too, like me, why some people hold such extreme views. While in some areas extreme views can potentially be useful or at least benign, in most areas they are known to be harmful: either for the person holding the views, or for the people surrounding her, or both.

Recruitment into radical Islamic movements has renewed global interest in political extremist views. This time, given two competing psychological theories about profiles of political extremists, researchers used computational methods on large datasets to assess which theory agrees better with the data.

Nicely summarized, the two competing hypotheses are:

  • extremists differ psychologically from mainstream activists regardless of their left or right ideology (Collective Behavior Hypothesis),
  • left- and right-ideology activists differ psychologically from each other, independently of whether they are extremist or mainstream (Moral Foundations Hypothesis).

Perhaps surprisingly, researchers have found that the first hypothesis agrees with the (Twitter) data. If confirmed on other datasets, this result would mean, for instance, that radical pro-environmentalists or anarchists have more in common with Neo-Nazis or Neo-Confederates than one would perhaps expect.

While it is probably still too early to interpret the result in the above manner, I want to point to one last detail that I found incredibly curious in the study discussed. Besides being different on all the Big Five Personality Traits (openness, agreeableness, conscientiousness, extroversion, neuroticism), political extremists score higher on openness to experience than all other types of people the researchers investigated. Given that openness is defined to include active imagination (fantasy), aesthetic sensitivity, attentiveness to inner feelings, preference for variety, and intellectual curiosity, how perplexed are you by the result? There are also moderate positive relationships of openness with creativity, intelligence and knowledge. However, another attribute of openness, related to the psychological traits of absorption and hypnotic susceptibility, might seem more expected.

 

I hope that the ideas presented here have left you inspired and interested in reading more about computational social science, as they did me.


[1] Lazer, David, Alex Sandy Pentland, Lada Adamic, Sinan Aral, Albert Laszlo Barabasi, Devon Brewer, Nicholas Christakis et al. “Life in the network: the coming age of computational social science.” Science (New York, NY) 323, no. 5915 (2009): 721.

[2] De Choudhury, Munmun, Emre Kiciman, Mark Dredze, Glen Coppersmith, and Mrinal Kumar. “Discovering shifts to suicidal ideation from mental health content in social media.” In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 2098-2110. ACM, 2016.

[3] Reece, Andrew G., and Christopher M. Danforth. “Instagram photos reveal predictive markers of depression.” arXiv preprint arXiv:1608.03282 (2016).

[4] Yom-Tov, Elad, Luis Fernandez-Luque, Ingmar Weber, and Steven P. Crain. “Pro-anorexia and pro-recovery photo sharing: a tale of two warring tribes.” Journal of medical Internet research 14, no. 6 (2012).

[5] Kleinberg, Bennett, Maximilian Mozes, and Arnoud Arntz. “Preprint: What’s in a name? Using named entities for verbal deception detection.” (2017).

[6] Alizadeh, Meysam, Ingmar Weber, Claudio Cioffi-Revilla, Santo Fortunato, and Michael Macy. “Psychological and Personality Profiles of Political Extremists.” arXiv preprint arXiv:1704.00119 (2017).

One Young World (OYW) Summit

Here, I just repost the video of the Peace and Reconciliation Session. I took part in the session with delegates from other countries in conflict. Actually, I felt quite fortunate, as we in Montenegro did not experience anything close to what people experienced in the countries from which the other delegates in this session came.

Professor Meghan O’Sullivan moderated the discussion in the Dublin main convention centre during which we represented ‘different perspectives’ of a range of conflicts we experienced. I represented The Balkans with several other delegates from neighbouring countries.

Full Session on the OYW website.

Space, security and religion

What do space, security and religion have in common? During the summer of 2016, one unique country was instrumental in tying those concepts together: Israel. Israel put itself in the spotlight as it hosted for two months the 29th International Space University (ISU) Space Studies Program (SSP) at the Israel Institute of Technology, Technion, in Haifa.

The SSP program gathers each year around 100 participants from over 30 countries: students from different fields, space professionals and other space enthusiasts; plus nearly another 100 staff: teaching associates, academic and logistic coordinators, core lecturers who stay throughout the program; plus many guest experts visiting for shorter periods. This year’s SSP had many firsts: first time in the Middle East, first ISU space selfie (see the gif below), first ISU drone on a stratospheric balloon.

Spending two months in Haifa as teaching associate for one of the SSP projects (Space Big Data) was an expectedly intense and inspiring team work experience. In addition to that, you learn that Mideastern Israel is more Western than you would guess and a much safer country than you would expect… You find that many more people holding diverse positions in the space field are religious than you would anticipate, and you get reminded by the well-preserved archeology around Israel that our space age takes only permilles of time elapsed since past civilisations have flourished here.

First ISU space selfie, taken by the EROS-B satellite operated by ImageSat International

If you were a bit worried before coming to Israel after reading much recent news about conflicts in the region, then you would start rethinking your image of what being safe means. During our stay, no major incidents happened in Israel, while at the same time in the ‘safe’ parts of the world, where many of us came from, several larger attacks took place: the Nice Bastille Day and Normandy church attacks in France, and the Munich shooting and Ansbach bombing in Germany. And if you have an Israeli friend who was five minutes from the Chelsea explosion while visiting New York at the time when this attack recently happened, then these questions are just reinforced in your mind.

SSP16 class photo with ISU Chancellor Edwin Buzz Aldrin. Photo credit: Nitzan Zohar

Spending time in Israel, I have learnt, is inevitably going to involve more religious experiences (in Serbian) and talking about religion than in many Western countries. How could it not be, when literally in each part of the country you find one of the holiest places for one of the three Abrahamic religions, and when sometimes, not being well informed, you might even visit one of them without knowing it? Jerusalem, with both Old and New Towns built only in stone, a central pilgrimage destination for many people across the globe, reminds us of unity on several levels. First, it hosts the holiest places for Christians (the Church of the Holy Sepulchre) and Jews (Temple Mount) and the third most holy for Muslims (Al-Aqsa Mosque). Second, the Church of the Holy Sepulchre is a simultaneum mixtum (a church in which public worship is conducted by adherents of two or more religious groups) of different Christian denominations: Greek, Syriacs, Egyptian Copts, Ethiopians and Armenian Orthodox, and Roman Catholic, in addition to having Muslim doorkeepers (in Russian).

View of Old Town Jerusalem, Israel

When one of the projects in SSP is investigating the state of our current possibilities for establishing human settlement on Mars (aMarte); when another is exploring the use of artificial gravity technology (Startport1) to support human space exploration; when our evening guest lecture talks about the Breakthrough Initiatives, aimed at finding evidence of technological life beyond Earth, and about light-powered space travel to Alpha Centauri… then the topics of space and religion touch, blur and spark in your mind. You get reminded that we humans have not forgotten our deepest, eternal longing questions: who are we, where does our world come from, are we alone, where do we go…? We, as humanity, are trying to answer these and many other fundamental questions from different angles and perspectives, that maybe sometimes intersect and meet…

Bringing security back into perspective, you hope that these space visions, even more grandiose than Sagan’s Pale Blue Dot and the ones inspired by the Blue Marble, will remind, if not us then our descendants, that the only security we should think of in the future is our common one in the vast universe of possibilities. You choose how hopeful you want to be.

Social Networks

One of the fruitful fields within computer science today, Social Network Analysis (SNA), proliferated thanks to the many online social networks and the active engagement of their users. Think of Twitter, Facebook, LinkedIn, Flickr, Swarm etc. SNA in particular enabled analysing some of the classical sociological theories within this new, online context. Hence, a synergy between sociology and computer science is called for, and the new field termed Computational Social Science emerged. Our research belongs to this field.

Homophily in communication

Homophily is the tendency of similar individuals to connect. The famous saying illustrates it simply: birds of a feather flock together. Homophily has long been known in sociology; however, recent SNA studies confirmed and quantified it on a large scale and in diverse settings.

We investigated homophily in Twitter communication on the basis of semantic features of users’ communication content, and also based on their social status in the Twitter network.

  1. In other words, for semantics, we asked whether users who in general talk about similar topics talk more to each other. Then we also looked at specific topics, and measured for which of them this tendency is more pronounced. We also found that users of similar sentiment talk more to each other.
  2. The question for social status is, in simple words, whether those who are more central and important in the social network tend to talk to others who are also more central and important. The answer is again positive.

While these results are expected given earlier knowledge from sociology, we extended the insights into the relationship between social status and semantic features of user tweets. In that regard, we find that the users who are more active and popular tend to use more diverse semantic content. At the same time, the most active users tend to have negative sentiment in their tweets.

A novel aspect of our study is that we investigated homophily on interaction links: based on mentions between users, instead of only following. In this way, instead of one-time and persisting links, we could assign a strength to the links, and we could also define when they are formed or disconnected. Thanks to this approach, we found that for users to start communicating, it is important that their tweet topics are similar (value homophily); while, at the same time, the reason for the disconnection of a once-active link is more likely to be their status difference (status heterophily) than differences in topics.
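To make the idea of value homophily on mention links concrete, here is a minimal sketch. It is not the method from the paper: it assumes each user is described by a small topic-share vector, and it simply compares the average cosine similarity of user pairs who mention each other with that of all other pairs. The user names, vectors and the function name homophily_gap are hypothetical, made up for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean(values):
    return sum(values) / len(values)


def homophily_gap(topic_vectors, mention_pairs):
    """Mean topic similarity of mentioning pairs minus that of all other pairs.

    A positive value suggests that users who talk to each other also talk
    about similar topics. Assumes both groups of pairs are non-empty.
    """
    linked = {frozenset(pair) for pair in mention_pairs}
    linked_sims, other_sims = [], []
    for a, b in combinations(sorted(topic_vectors), 2):
        sim = cosine(topic_vectors[a], topic_vectors[b])
        if frozenset((a, b)) in linked:
            linked_sims.append(sim)
        else:
            other_sims.append(sim)
    return mean(linked_sims) - mean(other_sims)


# Toy data (invented): shares of three topics per user.
topic_vectors = {
    "alice": [0.8, 0.1, 0.1],
    "bob": [0.7, 0.2, 0.1],
    "carol": [0.1, 0.1, 0.8],
    "dave": [0.1, 0.2, 0.7],
}
mention_pairs = [("alice", "bob"), ("carol", "dave")]

gap = homophily_gap(topic_vectors, mention_pairs)
print(f"homophily gap: {gap:.2f}")
```

In this toy network the gap is clearly positive, because the two mentioning pairs share their dominant topic. A real analysis would also need a significance test against randomized links.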

Šćepanović S., Mishkovski I., Gonçalves B., Nguyen Trung Hieu, Hui P. “Semantic homophily in online communication: evidence from Twitter“, Online Social Network and Media, Elsevier, 2017 (to appear).

Smart Energy Grid

Residential setting: CIVIS

Most of our research in this field was conducted for the smart grid in a residential setting, as part of the CIVIS project. The main aim of CIVIS was to develop a social energy app to change energy practices in homes towards more sustainable ones.

Learning from others

First, we have performed a large literature review, in order to understand what has worked and what did not in the energy interventions conducted so far:

Lean, iterative and innovative approach

Alongside, we created mock-ups and conducted user studies in several of the partner universities, in the process of iterative, user-centered app design.

Originally, the name for the app was EnergyUp. At a later stage, we selected YouPower as a more appropriate name.

What we did at Aalto as part of CIVIS

The user study and the startup/innovation project we did at Aalto are described under YouPower on this website.

How we designed the CIVIS app in the end

There is also an external, CIVIS project link for the YouPower app.

The publication about the design:

Yilin Huang, Hanna Hasselqvist, Giacomo Poderi, Sanja Scepanovic, Filip Kis, Cristian Bogdan, Martijn Warnier and Frances M. T. Brazier, YouPower: An Open Source Platform for Community-Oriented Smart Grid User Engagement, in: Proceedings of the 14th IEEE International Conference on Networking, Sensing and Control, pages -, IEEE, 2017.

 


Industrial setting: Green Big Data

During my visit at CERN, we also worked towards improving energy efficiency, but this time of a data centre. We received a dataset with energy consumption and computing statistics of a large computing centre: CSC — IT Center for Science. CSC provides computing services to the Finnish scientific community, but also to physicists from other countries, as it belongs to the Tier-2 of Worldwide LHC computing grid from CERN.

During this visit, I collaborated with:

We looked at the correlations between the application and system level logs and the energy consumption of the data centre. We clustered the computing nodes based on the vmstat and RAPL variables. Then we also showed that energy consumption on a node can be estimated from these variables.
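As an illustration of the last step, the sketch below fits a plain linear model that estimates node power from a few system-level variables. This is an assumption made for illustration, not necessarily the model used in the paper. The two features (a CPU utilisation percentage and an active-memory figure) and all numbers are invented toy values, and the least-squares fit is written out by hand so the example has no dependencies.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_power_model(features, power):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * p for r, p in zip(rows, power)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_power(coeffs, feature_row):
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], feature_row))


# Invented samples: (cpu_user_pct, memory_active_gb) -> node power in watts.
# The toy targets follow power = 50 + 1.0 * cpu + 0.5 * memory exactly.
features = [(10, 20), (50, 10), (80, 40), (30, 30), (60, 5)]
power = [70.0, 105.0, 150.0, 95.0, 112.5]

coeffs = fit_power_model(features, power)
estimate = predict_power(coeffs, (20, 20))
print([round(c, 3) for c in coeffs], round(estimate, 1))
```

With real logs, the feature columns would come from the vmstat and RAPL readings per node, and one would check the fit on held-out measurements rather than on the training samples.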

Our results were accepted for the Workshop on Energy-Aware High Performance Computing, EnA-HPC 2017.

Kashif Nizam Khan, Sanja Scepanovic, Tapio Niemi, Jukka K. Nurminen, Sebastian Von Alfthan, Olli-Pekka Lehto. “Analyzing the Power Consumption Behavior of a Large Scale Data Center“, Workshop on Energy-Aware High Performance Computing, EnA-HPC, June 2017.

Human dynamics

It was a privilege to get a chance to analyse what was, at that time, the largest mobile phone dataset released to the research community. The data from Orange Telecom in Cote d’Ivoire were released for the Data for Development (D4D) challenge in 2012.

So we asked: could such anonymized mobile communication data (call timings, locations and person ids) serve as a socio-economic proxy indicator for the country? The answer is yes.


Mobility <–> communication frequency

For instance, from the averaged frequency and length of communication, one can clearly observe important events in the country, as well as correlate those with the mobility of people.

The black footprint shows mobility (calculated from calling locations). The red shows calling frequency and the blue, the duration. We can see from Figure 1 that mobility and frequency correlate (we also calculated this to confirm). Interestingly, the duration of calls has a different pattern and does not positively correlate with either frequency or mobility. Our conclusion is that people, when on the move, communicate more in terms of number of calls, but when they want to make longer calls, they prefer to be in one place. Not that surprising when one thinks about it.

Now, for all three of the activities, one can easily identify New Year hours, Christmas, and Easter. Without previous knowledge, we could find out from those anonymized data that Cote d’Ivoire is a country in which religion is important (for a large part of its inhabitants).

Graph (b) in Figure 1 shows the averaged daily traveled distance, and we have identified that the three peak periods match the December 11, 2011 parliamentary elections, then the Africa Cup of Nations 2012 in football, where Cote d’Ivoire played in the final, and the Bouake Carnival and Fete du Dipri.

Figure 1: Mobility vs. calling frequency and duration


Economic status <–> radius of gyration

Another interesting finding is that a relatively simple quantification of human mobility, such as radius of gyration, can tell us a lot about different regions in the country.

This African country has its economic and development center in the city of Abidjan, on the south-east coast, while the northern and western parts are less developed and, on many indexes, considered poor. The radius of gyration measures how far people travel on average (a very simplified interpretation, but it serves our purpose). On map (a) in Figure 2, we use a darker color for the regions in which people have a larger radius of gyration.

Now, it is apparent that the people in and near Abidjan have a relatively low radius of gyration, showing that they do not travel too far from their home location, while people from the poor regions have a considerably larger radius of their trajectories. That is because they do not have all the necessary services (hospitals, schools, ports) nearby, and they need to travel further and more frequently to fulfill their basic needs. Moreover, it is rather clear how the whole country seems to gravitate towards the wealthy south-east coast and Abidjan.
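For readers who want the measure spelled out: the radius of gyration of a user is commonly defined as the root mean square distance of the user's recorded positions from their center of mass, r_g = sqrt((1/N) * sum_i |r_i - r_cm|^2). The sketch below computes it under a simplifying assumption that positions are already given as planar (x, y) coordinates in kilometres; with real antenna coordinates in latitude and longitude one would use a geodesic distance instead. The two toy users are invented.

```python
import math


def radius_of_gyration(points):
    """Root mean square distance of the points from their center of mass.

    points: list of (x, y) positions in km, one per recorded call.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mean_sq = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n
    return math.sqrt(mean_sq)


# Invented users: one calling from a small neighbourhood, one travelling far.
stays_local = [(0, 0), (1, 0), (0, 1), (1, 1)]
travels_far = [(0, 0), (30, 0), (0, 40), (30, 40)]

print(round(radius_of_gyration(stays_local), 3))  # about 0.707 km
print(radius_of_gyration(travels_far))            # 25.0 km
```

Averaging this per-user value over the users of a region gives one number per region, which is the kind of quantity a regional map or comparison can be built on.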

In the graph (b) on the right in Figure 2, we have averaged the radius of gyration for the 3 regions:

  1. Wealthy Abidjan,
  2. Poor North-West,
  3. The whole Cote d’Ivoire.

Our aim is to show that this measure of human mobility, which has otherwise been shown to be consistent over time for one country, differs between regions, serving as an economic status fingerprint.

Figure 2: User radius of gyration statistics


Administrative units and economic centers <–> commuting

Finally, to my own surprise, using only the communication data, we were able to find the home and work locations of users, and based on those to calculate the commuting network. Applying one of the common network partitioning algorithms (modularity detection), we were able to identify regions of commuting that match incredibly well with the administrative regions in the country. And for those areas where the obtained commuting regions do not match (mostly red, blue, green and cyan), we can easily identify the reasons: the borders are distorted by the economic centers (Abidjan, Yamoussoukro, Gagnoa) that attract commuting.

On the left map in Figure 3, we show the important economic centers, which are identified after running a standard PageRank algorithm on the commuting network.
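The pipeline described above can be sketched in a few lines. The home and work heuristic here (most frequent calling location during night hours and during office hours) is an assumption for illustration and not necessarily the rule used in the article, as are the hour ranges, the user ids and the location names. PageRank is written as a plain power iteration so the sketch has no dependencies; the modularity-based partitioning step is left out.

```python
from collections import Counter, defaultdict

NIGHT_HOURS = set(range(20, 24)) | set(range(0, 7))  # assumed "at home" hours
DAY_HOURS = set(range(9, 17))                        # assumed "at work" hours


def most_common_location(records, hours):
    """Most frequent location among (hour, location) records in given hours."""
    counts = Counter(loc for hour, loc in records if hour in hours)
    return counts.most_common(1)[0][0] if counts else None


def commuting_edges(calls_by_user):
    """Count users commuting from each home location to each work location."""
    edges = Counter()
    for records in calls_by_user.values():
        home = most_common_location(records, NIGHT_HOURS)
        work = most_common_location(records, DAY_HOURS)
        if home is not None and work is not None and home != work:
            edges[(home, work)] += 1
    return edges


def pagerank(edges, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration; dangling mass is spread evenly."""
    nodes = sorted({node for edge in edges for node in edge})
    out_weight = defaultdict(float)
    for (src, _), weight in edges.items():
        out_weight[src] += weight
    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0)
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Invented call records: user id -> list of (hour of day, location).
calls_by_user = {
    "u1": [(22, "A"), (23, "A"), (10, "C"), (11, "C")],
    "u2": [(21, "B"), (6, "B"), (9, "C"), (14, "C")],
    "u3": [(2, "C"), (12, "C")],  # lives and works in the same place
}

edges = commuting_edges(calls_by_user)
rank = pagerank(edges)
print(dict(edges), max(rank, key=rank.get))
```

In this toy example both commuters travel to location C, so C receives the highest rank, which mirrors how economic centers that attract commuting stand out in the real network.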

Figure 3: Regions and centers of commuting importance


Summary

While this work shows a lot of ideas obtained based on what we already know about this particular country, there are at least a few points that amazed me, a lover of data analysis, and convinced me of its power:

  • When we call is different to how long we call.
  • How we move has to do with how wealthy we are.
  • We are free, but are we aware that we still move a lot under some invisible constraints?

For the rest and more details of our analysis, you can have a look at our PLoS ONE article.

Šćepanović, S., Mishkovski, I., Hui, P., Nurminen, J.K., Ylä-Jääski, A., “Mobile Phone Call Data as a Regional Socio-economic Proxy Indicator,” PLOS One, 2015.