This paper solves the two-armed bandit problem when decision makers are risk averse. It shows, counterintuitively, that a more risk-averse decision maker might be more willing to take risky actions.
The reason is that pulling the risky arm in a bandit model produces information about the environment, thereby reducing the risk that the decision maker will face in the future. This finding calls for caution when inferring risk preferences from observed actions: in a bandit setup, a greater appetite for risky actions can actually indicate more risk aversion, not less.
Studies that do not take this into account may produce biased estimates of risk preferences.