The Themes of Entrepreneurship Discourse: A Data Analytics Approach

Scholars are devoting heightened attention to the language of entrepreneurship and to its influence on the cognition, behaviors, and outcomes of entrepreneurs and their stakeholders. However, the primary themes that constitute entrepreneurs’ language are unexamined. In this partially-inductive study, we identify the most common themes in entrepreneurship discourse and explore how they have changed over time. To map the themes in entrepreneurs’ language, we use data analytic techniques coupled with text mining algorithms to analyze a longitudinal corpus of entrepreneurial discourse. Our findings reveal five dominant and recurring themes in entrepreneurship discourse – marketing activities, technology-oriented entrepreneurship, digital entrepreneurship, professional investment, and new venture entrepreneurship – and illustrate how these themes are evolving. By examining the key themes in the discourse of entrepreneurs and charting their transformation over time, our study makes theoretical and methodological contributions to entrepreneurship research. We identify the areas where the academic literature seems to be lagging practitioner discussions and suggest that scholars should evaluate research for how closely topics are calibrated with the main themes in the discourse of entrepreneurs. Our findings also produce practical implications for entrepreneurs by identifying the main themes receiving attention, which allows entrepreneurs to evaluate if the topics that comprise their day-to-day discourse align with the themes emphasized in the larger body of entrepreneurship discourse.

Despite the strides made by studies of entrepreneurs' language, research has not attempted to identify the common themes in entrepreneurial discourse. Scholars generally adopt an interpretivist approach (cf. Leitch, Hill, & Harrison, 2010), which involves examining how discourse is constructed and interpreted during social interactions. The focus of this work is capturing rich representations of higher-level discourse constructs, such as narratives and stories, rather than understanding word-, phrase-, or theme-level language. Research primarily emphasizes how entrepreneurs use language and the outcomes of language use, and does not devote attention to the content and structure of entrepreneurial discourse (e.g., Lounsbury & Glynn, 2001). This represents an important omission in studies of entrepreneurs' language because, without a detailed understanding of the themes of entrepreneurial discourse, it is difficult to identify the topics that are at the center of entrepreneurs' communications and attention.
To address these omissions in prior research, in this study we examine two related questions: what are the themes that comprise entrepreneurship discourse, and how have these themes changed over time? To explore these questions, we use a partially-inductive methodology (cf. Gioia, Corley, & Hamilton, 2013), coupled with research from linguistics and entrepreneurship, to analyze the themes that are present in a corpus of entrepreneurship discourse. Specifically, we combine MapReduce programming, a Big Data methodology (cf. Asllani, 2014), with traditional statistical methods to develop a text mining algorithm that generates insights into the contextualized themes of entrepreneurship

LITERATURE REVIEW
The linguistic (or "discursive") turn in the social sciences (e.g., Harre, 2008) emphasizes the power of language to shape how reality is perceived, interpreted, and described. Social scientists' growing interest in language is motivated, in part, by the linguistic paradigm in philosophy, which laid the foundations for studying the influence of language on human cognition (Wittgenstein, 1922; cf. Lycan, 2012). Disciplines as disparate as law and criminal justice (e.g., Maynard, 1988), medicine (e.g., Greenhalgh, 1999), public health (e.g., Greene & Brinn, 2003), and agriculture (e.g., Morgan, Cole, Struttmann, & Piercy, 2002) find that language-use is not "just talk" but can influence decision making, the persuasiveness of communication, the transfer of knowledge, and how people and organizations are evaluated (e.g., Breunig & Roberts, 2017). For example, scholars studying environmental policy decisions find that the language used to frame policies influences decision making, persuasion, and evaluation (cf. Feindt & Oels, 2005). Rydin (1999), for instance, examines the language of sustainability-focused environmental policies and, quoting Edelman (1988, p. 103), argues that environmental policy is influenced by "language games that construct alternative realities, grammars that transform the perceptible into non-obvious meanings, and language as a form of action that generates radiating chains of connotations while undermining its own assumptions and assertions." The language contained in types of discourse, such as narratives, is so influential it has been argued that "all of our knowledge is contained in stories and the mechanisms to construct and retrieve them" (Schank & Abelson, 1995, p. 1). Because of the role of language in the construction and transmission of human culture, scholars even argue that a more accurate name for the human race is homo narrans, that is, "narrative humans" (Niles, 1999).
The growing attention to linguistic issues in other social science disciplines spurred organizational researchers to consider the role of language in business contexts. Language can manifest in organizations in any form that discourse can take (Chatman, 1980), including direct inter-personal interactions or written texts. Studies examine the role of language in microphenomena, such as employee identity construction and sensemaking, and macro-oriented phenomena, such as organizational change and legitimation (cf. Vaara, Sonenshein, & Boje, 2016). In exploring these phenomena, studies analyze the language used in texts such as annual reports (e.g., Subramanian, Insley, & Blackwell, 1993), shareholder letters (Jameson, 2000), earnings press releases (e.g., Henry, 2008), and corporate websites (Pollach, 2003).
However, most entrepreneurship research examining discourse does not examine the specific words and themes that constitute the language of entrepreneurs. For example, Nicholson and Anderson (2005) analyze the role of discourse in sensemaking and sensegiving about entrepreneurship. They examine how the language about entrepreneurship contained in myths and metaphors presented in a British newspaper influences the image of entrepreneurship portrayed to readers. Similarly, Steyaert (2007, p. 463) argues that the social construction of entrepreneurship is conceptualized through "a myriad of linguistic forms and processes," including discourse (Perren & Jennings, 2005), dramatization (Downing, 2005), metaphors (Dodd, 2002), and storytelling (Pitt, 1998). Roundy (2014) examines how the narratives constructed by social entrepreneurs influence their ability to secure professional investment. Although these studies increase understanding about how entrepreneurs use language to construct discourse and communicate, they do not examine specific word- or theme-level patterns. These studies also do not base their findings on a large corpus of text; instead, they focus on the discourse of small samples of entrepreneurs and ventures, rather than examining a broad sample of discourse across sectors.
A study by Parkinson and Howorth (2008) is an exception. They interview social entrepreneurs and then use corpus linguistics software and critical discourse analysis to identify common linguistic themes such as "local issues," "collective action," "geographical community," and "local power struggles." Moss, Renko, Block, and Meyskens (in press) and Parhankangas and Renko (2017) also examine word-level linguistic characteristics in their analyses of how entrepreneurs communicate about their ventures on crowdfunding platforms. They find that entrepreneurs' linguistic styles impact audiences' resource allocation decisions.
These studies and others (e.g., Lounsbury & Glynn, 2001; Martens et al., 2007) improve our understanding of the role of language and discourse in entrepreneurial activities. However, important issues remain unaddressed. First, as described, scholars examining entrepreneurial discourse primarily adopt interpretivist and social constructivist perspectives (Fenton & Langley, 2011) that are based on ethnographic and qualitative methods. Interviews are often used to capture language. However, as Achtenhagen and Welter (2007) argue, "the use of language in entrepreneurship research has potential far beyond the use of interviews" (p. 193). Entrepreneurship researchers generally do not use quantitative methods focused on measuring and mapping the precise composition of language. Studies are also not based on a large corpus of text, in part, because analyzing such data is challenging using hand-coding

RESEARCH METHODS
To answer our guiding research questions (i.e., what are the most prominent themes in entrepreneurship discourse and how have these themes evolved over time), we used a Big Data programming approach (MapReduce) and text mining software to analyze a large corpus of web content. Big Data is defined as data with the following characteristics: high volume, velocity, and variety (Katal, Wazid, & Goudar, 2013). Big Data is generated by sources such as social networks, web server logs, web page content, banking transactions, and financial markets. A unique set of processing and storage techniques is used to handle the challenges of collecting and analyzing Big Data (Asllani, 2014; White, 2012). Linguistic data can be analyzed with text mining methodologies, described in detail in the next section, which are used to process large amounts of text and to identify non-obvious patterns in a corpus (i.e., a collection of text; Feldman & Sanger, 2007). Text mining reveals patterns and quantifies emerging keywords and phrases, which provide insight into a corpus's linguistic structure and themes (Baker et al., 2008; Morley & Bayley, 2009).
Due to the complexity and size of our dataset, we created a modified version of a traditional word-count algorithm (Dean & Ghemawat, 2008). Using a word-count algorithm with a large corpus can be challenging because it requires significant time to process the text in the corpus. We modified a MapReduce algorithm (described in detail in the next section) to run in a distributed file system (a Hadoop cluster with four nodes) and to perform the embarrassingly parallel computations in reduced time. "Embarrassingly parallel computing" is a programming concept used to describe computation problems that can be divided into a large number of parallel tasks with little effort (Herlihy & Shavit, 2012). Our word-count algorithm is a typical parallel computing task, which is used to make data analysis more manageable.
The lack of prior theoretical work on the themes of entrepreneurial discourse suggests the appropriateness of an exploratory, partially-inductive research design. Inductive research is appropriate when it is not clear a priori what specific constructs (or, in our study, words and themes) should be measured. Inductive studies generate data-driven theoretical and empirical insights rather than testing a priori theoretical frameworks. With a purely inductive design, the researchers design a study with limited (or even no) preconceptions about how a phenomenon works and allow the data to guide what questions are asked and, ultimately, what theories are informed.
Since we use guiding research questions about the themes of entrepreneurial discourse to focus our analysis, our study is appropriately described as partially-inductive (cf. Gioia, Corley, & Hamilton, 2013). A benefit of this approach is that it limits the influence of the preconceived notions and assumptions of the researchers about what themes are important – or should be important – in entrepreneurship. Minimizing the influence of such assumptions is critical because one of the main aims of the study is to understand if the themes of practitioner discourse align with, diverge from, or challenge the main topics examined by entrepreneurship scholars. If, instead, we tested for themes identified from the entrepreneurship literature a priori, we would be unlikely to uncover themes that are unique to practitioner discourse.
In addition to the distinction between deductive and inductive approaches, there are also important differences between qualitative and quantitative methods for text analysis (cf. Berelson, 1952; Roberts, 2000). A text can be analyzed using qualitative methods that rely on researchers hand-coding texts for themes and subthemes (cf. Bowen, 2009). The advantage of this approach is that the researcher is directly analyzing the data, rather than using a computer-automated text analysis (CATA) program, which allows for rich and nuanced analysis of the data (Graebner, Martin, & Roundy, 2012). The chief downside of the qualitative approach, and the primary reason we adopted quantitative methods, is that hand-coding is a time-intensive process best-suited to relatively small datasets and corpora of text (Laver, Benoit, & Garry, 2003; Monaghan, Chater, & Christiansen, 2005). As described below, our dataset and research design produced a large corpus comprised of several million words and over three thousand web pages. It would have been very cumbersome to hand-code such a large dataset. Another advantage of quantitative text analysis approaches is that they are "hands-off" in that they rely on algorithms, not subjective perceptions, to identify common words and themes.

Data collection
Our data source was the 2016 "Forbes Best 100 Websites for Entrepreneurs." The "Forbes Best…" is a list of websites selected annually (since 2013) by Forbes writers. The websites are selected for their: "ability to address a range of topics of interest to entrepreneurs. Frequent posts and content quality helps get a nod. The list is a combination of practical tools – sites to crowdsource funding like Rock The Post or AngelList, or sites with educational resources, like Stanford's eCorner – and inspirational advice from bloggers like Seth Godin and Steve Blank." (Forbes, 2013).
We chose the "Forbes Best…" list, rather than compiling our own list of websites, to limit idiosyncratic researcher (and academic) bias and because the Forbes list seemed to represent a broad range of entrepreneurial discourse (e.g., discourse about starting a venture, acquiring funding, selling, and scaling). Also, Forbes relied on nominations from the entrepreneurship community to compile the list, asking for websites "that can address a wide range of topics, like how to start up, establish your brand, build a bang-up team and secure that seemingly elusive round of capital" (Forbes, 2015). The fact that Forbes "crowdsourced" at least some of the list suggests that the list contains websites that are, in fact, important to entrepreneurs. Although there are other lists of "top entrepreneurship sites" (e.g., Entrepreneur.com's "8 successful online entrepreneurs you should be following"), the Forbes list was the most wide-reaching and comprehensive we could find.
In selecting the "Forbes Best…" list, we analyzed sites to ensure that they represented forums for entrepreneurial discourse. We ensured that entrepreneurship was the primary focus of the sites, rather than a niche interest. We also examined each site at different points in its history to ensure that the focus of the domain name had not changed. One of the reasons we ultimately selected the Forbes list is that most of the sites were structured as blogs (i.e., rather than reproducing a story from another source, each posting had an identifiable author with a point of view) and readers could comment on each posting, which allowed for two-sided, interactive communication (a dialogue).
Volume 14, Issue 3, 2018: 127-158 / Philip T. Roundy, Arben Asllani
We constructed a corpus of text by sampling discourse from each of the websites at two different dates per year for a 16-year period (2001–2016). Using the Internet Archive (www.archive.org) and its "Wayback Machine" feature, we captured two "snapshots" of the discourse content of each website from each year. A list of the uniform resource locators (URLs) for each site and each snapshot was generated. We then downloaded the web content into a Hadoop Distributed File System (HDFS) containing the text from each site. The content of the websites was downloaded using the wget utility, defined as:

$ wget -l 2 -i url_list
where:
• $ is the prompt in the Linux environment terminal;
• wget is a freely-available utility for downloading files from the web that supports HTTP, HTTPS, and FTP protocols (i.e., the protocols that allow data communication on the web), and retrieval through HTTP proxies. wget is non-interactive, meaning that it can operate in the background of other operations. The command creates local versions of remote websites which are submitted to the HDFS for further processing;
• -l 2 indicates level 2 inclusion in the download process. Level 1 of a URL represents the main page of the website and is normally named index.html. Level 2 represents the webpages that are linked to the main page;
• -i indicates the input, which can be found in the file named url_list;
• url_list is a text file containing the list of web page addresses from which the content should be downloaded.
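The url_list input described above could be assembled programmatically. The following is a minimal Python sketch, not the authors' script: the site address and the two sampling dates per year are placeholders (the text does not say which dates were used), and the function names are hypothetical. It relies on the Wayback Machine's public URL pattern, in which a timestamp is placed before the original address and the archive serves the capture closest to that timestamp.

```python
from datetime import date

# Assumed for illustration only: the source does not state which two dates
# per year were sampled, so these month/day pairs are placeholders.
SNAPSHOT_MONTH_DAYS = [(1, 1), (7, 1)]


def wayback_url(site: str, when: date) -> str:
    """Build a Wayback Machine address for the capture nearest to `when`."""
    return f"https://web.archive.org/web/{when:%Y%m%d}/{site}"


def build_url_list(sites, first_year=2001, last_year=2016):
    """Return two snapshot URLs per site per year, in chronological order."""
    urls = []
    for site in sites:
        for year in range(first_year, last_year + 1):
            for month, day in SNAPSHOT_MONTH_DAYS:
                urls.append(wayback_url(site, date(year, month, day)))
    return urls


if __name__ == "__main__":
    # "http://example.com/" stands in for one of the listed websites.
    for line in build_url_list(["http://example.com/"])[:4]:
        print(line)
```

Writing the returned lines to a text file, one address per line, gives a file in the format that the -i option expects.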
We then created a MapReduce program to read the text between <body> and </body> tags in the index file of the website. Table 1 provides a summary of our data collection methodology. Overall, we downloaded 3,434 webpages spanning 2001 to 2016 and used these data for the text mining methodology. On average, 215 unique webpages (from the Forbes Best 100 Websites) were downloaded each year. The number of webpages is not equivalent to the number of websites because, as described, we analyzed data two levels deep (i.e., the main page for each website and the pages linked to the main page). That is, for a year in which all of the Forbes Best 100 websites are available, at least 200 webpages were analyzed (the 100 websites at two points during the year). Finally, the number of webpages analyzed per year increased over time (as more webpages became available in recent years); however, we normalized our findings by year totals. These methods generated a corpus of entrepreneurial discourse of over 3 million words (3.55 gigabytes of raw text).
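Reading the text between the <body> and </body> tags can be illustrated outside of Hadoop with Python's standard-library HTML parser. This is a simplified, single-machine sketch rather than the authors' MapReduce program. The class and function names are hypothetical, and skipping script and style content is an added assumption that the text does not describe.

```python
from html.parser import HTMLParser


class BodyTextExtractor(HTMLParser):
    """Collect text that appears inside <body>, ignoring script/style content."""

    SKIPPED = {"script", "style"}  # assumption: not mentioned in the source

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        elif tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
        elif tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found inside the body and outside skipped tags.
        if self.in_body and self.skip_depth == 0:
            text = data.strip()
            if text:
                self.chunks.append(text)


def body_text(html: str) -> str:
    """Return the body text of an HTML page as one space-separated string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

For example, calling body_text on a downloaded index.html string returns only the words a reader would see on the page, which is the material a word count would then operate on.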

Data analysis
After constructing the corpus of entrepreneurship discourse, our analysis consisted of two parts: (1) identifying the major themes and (2) charting the trends of themes over time. We began by modifying a MapReduce algorithm (Dean & Ghemawat, 2008) to count the frequency of each word in the corpus. The program also eliminated common words (e.g., "the," "and"), HTML tags, and other symbols. Figure 1 contains pseudocode for the MapReduce program. The MapReduce algorithm was executed in a Hadoop cluster with four nodes. The most frequently used words for each year were selected and processed to eliminate duplicates. We also created obvious groupings (e.g., combining words like knowledge and information into information) and identified words sharing the same stem (e.g., finance, financial, and financing). Table 2 contains the full list of 126 words used in the factor analysis described below.
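The word-count step can be sketched in plain Python to show the map, shuffle, and reduce phases in miniature. This is an illustrative, in-memory version and not the modified Hadoop program shown in Figure 1. The stop-word set is a small assumed subset, and the regular expressions and function names are hypothetical choices.

```python
import re
from collections import defaultdict
from itertools import chain

# Illustrative subset only; the source does not list the common words removed.
STOP_WORDS = {"the", "and", "a", "an", "of", "to", "in", "is"}
TAG_RE = re.compile(r"<[^>]+>")              # strips HTML tags
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")  # keeps alphabetic tokens only


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every retained word in one document."""
    text = TAG_RE.sub(" ", document).lower()
    for word in WORD_RE.findall(text):
        if word not in STOP_WORDS:
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group the emitted values by key (word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: sum the values for each word to obtain its frequency."""
    return {word: sum(values) for word, values in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce over an iterable of document strings."""
    pairs = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle_phase(pairs))
```

Because each document is mapped independently of the others, the map phase can be distributed across nodes without coordination, which is what makes the task embarrassingly parallel.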

Identification of themes
We used exploratory factor analysis (EFA; Fabrigar & Wegener, 2011) to identify themes in the most commonly occurring words in the corpus. Once we identified the most frequent keywords, we calculated the frequency index $f_{ij}$ of each keyword $i$ in webpage $j$ as follows:

$$f_{ij} = \frac{n_{ij}}{N_j}$$

where $n_{ij}$ is the frequency of keyword $i$ in $j$ and $N_j$ is the total number of words in webpage $j$. To calculate $n_{ij}$ and $N_j$ we ran the MapReduce algorithm for each full webpage, with the keyword list as an input to the program.
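The frequency index described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the index is a keyword's count on a page divided by that page's total word count, which is how the surrounding definitions read. The tokenizer and the function name are hypothetical, and the sketch runs on a single page rather than through MapReduce.

```python
import re
from collections import Counter

WORD_RE = re.compile(r"[a-z]+")  # assumed tokenizer: lowercase alphabetic words


def frequency_index(page_text, keywords):
    """Return {keyword: count on page / total words on page} for one webpage."""
    words = WORD_RE.findall(page_text.lower())
    total = len(words)
    if total == 0:
        # An empty page contributes a zero index for every keyword.
        return {keyword: 0.0 for keyword in keywords}
    counts = Counter(words)
    return {keyword: counts[keyword] / total for keyword in keywords}
```

Dividing by the page length makes long and short pages comparable, so a keyword's index reflects its relative prominence on a page rather than the page's size.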

Figure 1. Modified MapReduce program used to identify frequent words
The Kaiser-Meyer-Olkin (KMO) value of 0.70 indicates that our data is suitable for factor analysis (Cerny & Kaiser, 1977). Bartlett's test of sphericity tests the hypothesis that the variables are unrelated and, thus, unsuitable for structure detection and factor analysis. A low significance value (<0.001) indicates that factor analysis is, in fact, useful with our data (Snedecor & Cochran, 1989).
Table 4 contains the factor correlation matrix. Five independent factors – themes – of entrepreneurship discourse were identified. Table 5 contains the strongest-loading words on each of the five themes. In the factor analysis, words with loadings of .30 and greater were retained (following the recommendation of Brown, 2006).
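The retention rule for factor loadings can be illustrated with a short Python sketch. The factor analysis itself is not reproduced here. The sketch assumes a loading table has already been estimated, assigns each word to the factor on which it loads most strongly, and compares absolute loadings with the .30 cutoff. Both the assignment rule and the use of absolute values are assumptions, and the function name is hypothetical.

```python
LOADING_CUTOFF = 0.30  # retention threshold reported in the text


def assign_words_to_factors(loadings, cutoff=LOADING_CUTOFF):
    """Group words by their strongest factor, keeping loadings at or above the cutoff.

    `loadings` maps each word to a list of loadings, one per factor.
    Returns {factor index: [(word, loading), ...]} sorted by loading strength.
    """
    themes = {}
    for word, row in loadings.items():
        # Assumption: a word belongs to the factor with its largest absolute loading.
        best = max(range(len(row)), key=lambda k: abs(row[k]))
        if abs(row[best]) >= cutoff:
            themes.setdefault(best, []).append((word, row[best]))
    for members in themes.values():
        members.sort(key=lambda item: abs(item[1]), reverse=True)
    return themes
```

Words whose strongest loading falls below the cutoff are dropped, which is why only a subset of the keyword list appears among the strongest-loading words for each theme.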

FINDINGS
The study aimed to identify the key themes in entrepreneurship discourse and to examine if these themes changed over time. In the following sections, we describe the five most common themes and their main characteristics.

Marketing activities.
The most commonly occurring theme in entrepreneurship discourse, appearing in over 42% of websites included in the corpus (Figure 2), is comprised of keywords such as marketing, sales, and (customer) data.Given the focus of the words that loaded on this factor, we labeled this theme marketing activities.

Figure 2. The representation of themes in entrepreneurship discourse
It is notable that the discourse of actual entrepreneurs reflects the increasing academic emphasis on entrepreneurs' marketing practices. This theme indicates that while it is important for entrepreneurs to create cutting-edge products and technologies, entrepreneurs are increasingly doing so by adopting a customer-centric mindset and using strategies (like design thinking; Elsbach & Stigliani, 2018) to understand consumers and gather customer data.
Technology-based entrepreneurship.
The second most common theme in the corpus of discourse revolved around a cluster of words and phrases involving technology-based entrepreneurship. This theme appeared in over 38% of websites. The highest factor loadings in this category included words such as technology, software, services (as in "cloud-based services" and "software as a service"), and technology shift. Individuals engaged in technology entrepreneurship assemble "resources and structures to exploit emerging technology opportunities" (Liu et al., 2005). Scholars acknowledge that technology entrepreneurship is not only a source of product innovation and technological advancement but serves as a potent mechanism for generating economic development (Bailetti, 2012). Our findings suggest that technology entrepreneurship is also now a central theme in practitioner entrepreneurship discourse.

Innovation, Entrepreneurship and Organizations' Business Performance
Milena Ratajczak-Mrozek, Tibor Mandjak (Eds.)

Digital entrepreneurship.
A distinct theme also emerged around digital entrepreneurship, which included words such as social (media), share, Facebook, and mobile. Digital entrepreneurship is a specific type of technology entrepreneurship focused on the pursuit of opportunities related to products and services based on digital media and other information technologies (Davidson & Vaast, 2010, p. 2; Nambisan, 2017). This theme, which appeared in approximately 10% of websites in the corpus, includes the host of new business models being created around social media activities (cf. Hanna, Rohm, & Crittenden, 2011; Khajeheian, 2013) and corresponds to the digitalization of many industry sectors (Autio, Nambisan, Thomas, & Wright, 2018).

Professional investment.
Another theme is comprised of keywords, such as venture, capital, funds, and VC, and phrases like venture capital. Because of the shared focus of these words, we labeled this theme "professional investment." Professional investors, such as venture capitalists, are commonly pursued by entrepreneurs as early-stage sources of funding that can complement (and come at a later stage than) other sources of startup funding, such as family and friends, angel investors, crowdfunding, and an entrepreneur's personal wealth (Ascher, 2012; Gompers & Lerner, 2001; Wong, Bhatia, & Freeman, 2009). The importance of early-stage professional investment in supporting the scaling of high-growth ventures makes it unsurprising that discussions about such investment are one of the primary themes of entrepreneurship discourse. In sectors in which entrepreneurs pursue exponential ("hockey stick") growth, such as internet technology, early-stage professional investment often represents a key source of funding that gives entrepreneurs access to the funds they need to develop their products, engage in R&D, hire a sales force, and create a marketing campaign (e.g., Davila, Foster, & Gupta, 2003). As Figure 2 illustrates, the venture capital theme was present in approximately 5% of discourse in the corpus. This percentage may reflect that, while professional investment is an important topic amongst some types of entrepreneurs, only a small percentage of entrepreneurs are creating the types of fast-scaling ventures that need, or can generate the type of returns that appeal to, such investors.

New venture entrepreneurship.
A final theme was comprised of words, like "startup," which are a direct reference to new businesses and the creation of new organizations. Words associated with this theme were present in less than 5% of the discourse, which might seem surprising given it is a corpus of entrepreneurship discourse; but there are at least two explanations for the theme's low frequency relative to other common themes. First, words that are directly related to the creation of new organizations, such as "new venture," might not need to be explicitly stated because the discourse was collected from entrepreneurship websites. In other words, there may be an implicit understanding that conversations are about activities involved in the creation of new firms and, thus, it is not necessary to overly use words like "startup" or "new venture" (e.g., articles about marketing challenges in new ventures might simply refer to "marketing challenges" because the understanding is that the focus is new firms).
Second, and more subtly, the low prevalence of the new venture entrepreneurship theme, relative to the other themes, may reflect the fact that entrepreneurship is increasingly not confined to the creation of new organizations (Morris & Jones, 1999). Rather, contemporary definitions of entrepreneurship (and "entrepreneuring") emphasize that entrepreneurship is the creation of innovative organizations, products, or initiatives that create value (Nasution et al., 2011; Roundy, Bradshaw, & Brockman, 2018). Pursuing opportunities for innovations that produce value can be done outside the startup context, such as in established organizations (cf. work on corporate entrepreneurship; Kuratko, Hornsby, & Covin, 2014; Zarei, 2017), or as part of causes, movements, or other types of temporary organizations that do not require the establishment of formal (fully-incorporated) ventures (Burke & Morley, 2016). Entrepreneurship discourse reflects these broader views of entrepreneurial phenomena.

The evolution of themes in entrepreneurial discourse
To examine how the themes identified in the previous section changed over time, we calculated the average frequency index for each theme during a given year, as:
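As a rough illustration only, the yearly averaging could be sketched as follows. The sketch assumes that a theme's score on a page is the sum of the frequency indices of the theme's keywords, and that the yearly value is the mean of those page scores over all pages sampled in that year. This is one plausible reading and not the authors' stated formula. The function name, the data layout, and the example keyword groupings are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def yearly_theme_average(pages, theme_keywords):
    """Average per-page theme scores within each year.

    `pages` is an iterable of (year, {keyword: frequency index}) tuples.
    `theme_keywords` maps a theme label to the keywords that load on it.
    Returns {year: {theme: mean page score}} with years in ascending order.
    """
    per_year = defaultdict(lambda: defaultdict(list))
    for year, indices in pages:
        for theme, keywords in theme_keywords.items():
            # Assumption: a page's theme score is the sum of its keyword indices.
            page_score = sum(indices.get(keyword, 0.0) for keyword in keywords)
            per_year[year][theme].append(page_score)
    return {
        year: {theme: mean(scores) for theme, scores in themes.items()}
        for year, themes in sorted(per_year.items())
    }
```

Averaging within each year, rather than summing, keeps years with many archived pages from dominating years with few, in line with the normalization by year totals described in the data collection section.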

Figure 3. The evolution of themes in entrepreneurship discourse
The figure indicates that the five themes can be further classified into two superclusters. The first consists of marketing activities and technology-based entrepreneurship, which during the span of the study were the most frequently-occurring themes in entrepreneurship discourse. The second consists of digital entrepreneurship, professional investment, and new venture creation, which were less dominant (occurring in less than 20% of the corpus) but have a continuous (albeit slightly increasing) presence during the past 16 years. One way to interpret these findings is that they indicate that marketing and technology are at the core of discourse about entrepreneurship while conversations about digital entrepreneurship, investment, and new venture activity are supplemental themes. Several additional trends emerge when examining the themes separately. For instance, "digital entrepreneurship" steadily increased from 2001 to 2010, presumably as the social media sector grew in prominence. From 2010 to 2012, there was a steep increase in digital entrepreneurship discourse, which has since leveled off. One possible explanation for the plateauing of the theme is that as social media platforms like Twitter and Facebook have become ubiquitous, the creation of business models and innovations based on digital technologies became an accepted part of entrepreneurship and, hence, a theme in entrepreneurship conversations that receives less attention. Furthermore, it is intuitive that technology-based entrepreneurship is a more common theme over time than digital entrepreneurship because the former is a more general type of entrepreneurship that includes a wider range of business models, industries, and products. Similarly, marketing activities is a more commonly occurring theme than professional investment because all ventures must interact with customers, but a smaller percentage pursue (and receive) professional investment. Overall, entrepreneurs' language reflects what is occurring in both the startup community and the general marketplace.

DISCUSSION
The role of language in constructing and describing entrepreneurial activities is a topic receiving increased interest (cf. Clarke, Cornelissen, & Healey, in press; Spinuzzi, 2016). The theme-level content of entrepreneurship discourse is, however, not fully understood. Two overriding questions guided our study: what are the primary themes of entrepreneurship discourse, and how have these themes changed over time? Below, we summarize the answers we uncovered and examine the contributions and implications of our findings to scholars and practitioners.

Contributions to scholarship
Despite growing attention to the discourse of entrepreneurs, we know surprisingly little about the specific themes that constitute their language. In this study, we identify the five most common themes in entrepreneurship discourse (marketing activities, technology entrepreneurship, digital entrepreneurship, professional investment, and new venture entrepreneurship) during the past 16 years. In doing so, we uncover, arguably, the most frequently discussed topics among entrepreneurs and the issues that they are giving the greatest attention. By creating a corpus from a range of national and international websites (from the Forbes Best 100 Websites for Entrepreneurs), we were able to identify the key themes in general entrepreneurship discourse, rather than focusing on the discourse tied to a specific subset of entrepreneurs, organizations, or industries. We were also able to approach the analysis without a priori assumptions about what themes are most important to practicing entrepreneurs. By identifying the word- and phrase-level patterns that create distinct themes in entrepreneurship language, we make several conceptual and empirical contributions to entrepreneurship research.
First, our findings provide empirical support for intuitive trends in entrepreneurship, such as the rise of technology and digital entrepreneurship. To the extent that entrepreneurship discourse both reflects and helps to construct what is given attention (e.g., Logan, 1999), the themes we identify represent the issues that entrepreneurs devote most of their attention to discussing. Related to this point, the findings also call into question whether the concepts receiving the most attention from scholars are the main topics comprising entrepreneurship discourse. For most of the themes, there is alignment between the existence of a robust stream of academic research and a vibrant practitioner discourse (e.g., technology entrepreneurship; professional investment; new venture entrepreneurship).
However, for two themes, marketing activities (in an entrepreneurship context) and digital entrepreneurship, the academic literature seems to be lagging behind practitioner discussions, which suggests that more research is needed on these aspects of entrepreneurship. For instance, the stream of research that has developed at the marketing and entrepreneurship "interface" (e.g., Hills & Hultman, 2011), the creation of academic organizations focused on this topic (e.g., the Entrepreneurial Marketing special interest group (SIG) in the American Marketing Association), and the scholarly events dedicated to marketing issues in entrepreneurship (e.g., the Global Research Symposium on Marketing and Entrepreneurship) are all making inroads in drawing attention to the importance of marketing in entrepreneurial activities. However, in many respects, this research is still considered a "niche" topic within the broader academic conversation about entrepreneurship. Our findings suggest that marketing issues are front-and-center in practitioner discourse and should occupy a more central position in academic conversations.
Furthermore, it is useful to think about what the two dominant themes in entrepreneurship discourse, technology entrepreneurship and marketing, represent. On a deeper level, the creation of new technologies is core to what entrepreneurs do and represents a primary form of "value creation" (e.g., Lepak, Smith, & Taylor, 2007). The introduction, development, and delivery of innovative technologies is central to the function that entrepreneurs serve in the marketplace. However, for entrepreneurs to be financially viable, they must also engage in "value capture" (Fayolle, 2007), which involves "the appropriation and retention by the firm of payments made by consumers in expectation of future value from consumption" (Priem, 2007, p. 220). Marketing activities are key to capturing value (Mizik & Jacobson, 2003). Thus, the dominant themes in entrepreneurship discourse reflect the two guiding logics, value creation and value capture, that entrepreneurs must manage.
An interesting, although counter-intuitive, finding is the lack of evidence in practitioner discourse for some of the main themes in entrepreneurship research. Most notably, the topic of "opportunity," and the examination of how entrepreneurs construct, discover, and develop new opportunities, is one of the most intensely researched topics in the entrepreneurship discipline (cf. Short, Ketchen, Shook, & Ireland, 2010). The word opportunity (and its variants), however, did not load on any of the five main themes we identified. There are at least two explanations for this result. First, opportunity may be a concept so pervasive in entrepreneurship, and so fundamental to the phenomenon, that entrepreneurs do not find it necessary to draw explicit attention to it. If so, then there is an unstated assumption among entrepreneurs that most conversations involve some aspect of turning an opportunity into a viable business. Alternatively, "opportunity" may be a concept that scholars devote significant time to understanding while entrepreneurs focus on more concrete topics and practices (Gartner, Stam, Thompson, & Verduyn, 2016). Entrepreneurs may not spend time thinking about and discussing concepts like opportunity because they are viewed as ethereal and not directly involved in day-to-day entrepreneurial activities. Our findings suggest that research is needed to examine the degree to which the opportunity concept plays a role in the practices of entrepreneurs.
The prevalence of the "digital entrepreneurship" theme, particularly post-2010, suggests that scholars should devote more attention to the growing digital infrastructure (Nambisan, 2017) and how it is changing entrepreneurial activities. For instance, research is needed on how entrepreneurs harness "technological affordances (Gibson, 1977) created by digital technologies and infrastructures," and how the digitalization of the economy represents an "economy-wide redesign of value creation, delivery, and capture processes" (Autio et al., 2018: 74). At the same time, scholars should be attuned to changes in the tenor of entrepreneurial (and consumer) discourse about digitization, as there may be a growing dialogue about the negatives of the digitalization of society and a developing counter-cultural movement away from digital to analog (e.g., Sax, 2016). Overall, our findings contribute to entrepreneurship research by serving as a reminder that scholars should be aware of the main themes in discourse about entrepreneurship to ensure that their research has some relevance to practitioners (cf. Vermeulen, 2007).

Innovation, Entrepreneurship and Organizations' Business Performance
Milena Ratajczak-Mrozek, Tibor Mandjak (Eds.)

Our study also has methodological implications. Most research on entrepreneurship and discourse employs qualitative methods, such as interviewing and ethnographic observation, and uses small samples composed of entrepreneurs from the same organization, industry, or geographic area. Our findings illustrate the use of quantitative, computer-automated text analysis (CATA) and a "Big Data" approach (Asllani, 2014). Our methodology allowed us to construct a broadly representative corpus of entrepreneurship discourse composed of over 3 million words and over 3,000 unique webpages. To the best of our knowledge, we are the first scholars to use this type of methodology in the context of entrepreneurship discourse. Our methods, which we describe in detail so that other researchers can follow them, represent an innovative approach to analyzing entrepreneurs' language.

Implications for practitioners
Research examining entrepreneurship discourse consistently finds that the language entrepreneurs use to conceptualize and describe their ventures matters. Language is not merely a reflection of cognition or behaviors; it can shape thinking and action (Lewis, 1966). For this reason, if entrepreneurs want to participate in conversations about entrepreneurship (e.g., when pitching their ventures or when gathering information from other members of their entrepreneurial ecosystem; Roundy, 2016), it is important for them to be aware of the main themes in entrepreneurship discourse so that they can tailor their language accordingly.
The content of the specific themes we identify also has implications for entrepreneurs. For example, entrepreneurs should acknowledge the important role played by marketing and what can be gained by taking a consumer perspective. Although this might seem like an obvious insight, many entrepreneurs, because of their backgrounds in non-business disciplines such as engineering and computer science, adopt a product- rather than customer-focus (Rosen, Schroeder, & Purinton, 1998). However, as evidenced by the high frequency of discussions about marketing and consumer activities, entrepreneurs are devoting an increasing amount of their discourse to marketing issues. At the same time, even though it was one of the least common of the five primary themes, discussions about professional investment still appeared in between 5% and 18% of website discourse. Given the extremely small percentage of firms that qualify for and receive professional investment (cf. Rao, 2013), this theme may actually be over-represented in entrepreneurs' conversations. That is, entrepreneurs may be too concerned with discussing "how to attract venture capital" rather than pursuing other funding options such as bootstrapping or crowdfunding (e.g., Belleflamme, Lambert, & Schwienbacher, 2014). Thus, entrepreneurs could use our findings to assess what they spend their time discussing and whether other topics should be the focus of their attention and discourse.

Limitations and directions for future research
Despite its contributions, our research is not without limitations, which serve as directions for future research. First, our sample consisted entirely of discourse from entrepreneurship websites. Although our sample produced a large corpus, it is not exhaustive of all types of entrepreneurship. Thus, while the corpus is representative of larger conversations about entrepreneurship, there may be some groups that are not part of these conversations. For example, some types of entrepreneurs, such as traditional small business entrepreneurs, may be less likely than entrepreneurs who are growing rapidly scaling ventures to take part in the discussions on the websites we examined. Furthermore, the "Forbes Best 100…" list is only a sample of global entrepreneurship discourse and has the limitation of representing only English-language sources. Research is needed examining the discourse of entrepreneurs outside the Western context.
In addition, as we have noted, our corpus is composed of discourse from practitioners and does not reflect academic discourse about entrepreneurship. An important direction for future studies is formally analyzing the extent to which discourse contained in scholarship about entrepreneurship is lagging (or leading) practitioner entrepreneurship discourse. To explore this issue, researchers could create a corpus, similar to the one constructed for this study, but composed of a collection of academic entrepreneurship articles from the same period as our study (e.g., all articles published in a particular journal or set of journals). Our text mining methodology could then be used to identify the main themes in academic entrepreneurship discourse, to determine how they have changed over time, and to assess how much scholarly discourse matches or diverges from practitioner discourse.
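As a rough illustration of the comparison proposed above, the following sketch computes, for each year, the share of documents in a corpus that mention a theme, and then the gap between a practitioner corpus and an academic corpus. It is a deliberately simplified stand-in: it uses keyword matching rather than the factor-analytic approach used in this study, and the theme keyword lists, function names, and input format (pairs of year and text) are all hypothetical assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical theme keywords, for illustration only. The study itself derived
# themes from factor analysis, not from hand-picked keyword lists.
THEMES = {
    "marketing": {"marketing", "customer", "brand"},
    "digital": {"digital", "online", "platform"},
}


def theme_share_by_year(docs, themes=THEMES):
    """docs: iterable of (year, text) pairs.

    Returns {theme: {year: share of that year's documents that contain
    at least one of the theme's keywords}}.
    """
    totals = defaultdict(int)
    hits = {name: defaultdict(int) for name in themes}
    for year, text in docs:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        totals[year] += 1
        for name, words in themes.items():
            if tokens & words:
                hits[name][year] += 1
    return {
        name: {year: hits[name][year] / totals[year] for year in sorted(totals)}
        for name in themes
    }


def share_gap(practitioner, academic):
    """Practitioner share minus academic share, for years present in both."""
    return {
        theme: {
            year: practitioner[theme][year] - academic[theme][year]
            for year in practitioner[theme]
            if year in academic.get(theme, {})
        }
        for theme in practitioner
    }
```

A positive gap for a theme in a given year would indicate that the theme is more prevalent in the practitioner corpus than in the academic corpus for that year; a persistent positive gap would be consistent with scholarship lagging practice on that theme.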
An additional avenue for future research is to go beyond examining themes to analyze the deeper-level linguistic characteristics of entrepreneurship discourse. For example, CATA software, such as the Linguistic Inquiry and Word Count (LIWC) program, could be used to examine the social and psychological properties of entrepreneurial discourse, including its emotionality and concreteness (cf. Pennebaker et al., 2001).
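The general idea behind dictionary-based text analysis of this kind can be sketched as follows: each category is a word list, and a text is scored by the share of its tokens that fall in each list. This is only a minimal sketch of the technique and not the LIWC program itself; the category names, word lists, and function names below are hypothetical and far smaller than any validated lexicon.

```python
import re
from collections import Counter

# Hypothetical category dictionaries, for illustration only.
# These are NOT the LIWC lexicon.
CATEGORIES = {
    "positive_emotion": {"excited", "love", "great", "win"},
    "negative_emotion": {"fear", "risk", "fail", "worry"},
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def category_shares(text, categories=CATEGORIES):
    """Return, for each category, the share of tokens found in its word list."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    # An empty text yields a share of 0.0 for every category.
    return {name: (counts[name] / total if total else 0.0) for name in categories}
```

Applied to each webpage in a corpus, scores of this kind could be averaged by year or by source to track how properties such as emotionality change over time.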

Figure 3 represents the frequency of each theme during the 2001-2016 period.
Journal of Entrepreneurship, Management and Innovation (JEMI), Volume 14, Issue 3, 2018: 127-158

Philip T. Roundy is the UC Foundation Assistant Professor of Entrepreneurship and Summerfield Johnston Centennial Scholar at the University of Tennessee at Chattanooga. He earned his Ph.D. in strategic management and organization theory at the University of Texas at Austin. His research interests center on social entrepreneurship, entrepreneurial ecosystems, and the role of entrepreneurship in economic development and community revitalization. His work has appeared in Strategic Organization, Journal of Management Studies, Journal of Business Venturing Insights, Academy of Management Perspectives, Journal of Business Research, Journal of Entrepreneurship, and others. He serves on the editorial boards of Journal of Business and Entrepreneurship and Journal of Applied Management and Entrepreneurship.

Arben Asllani is the Marvin E. White Professor of Management at the University of Tennessee at Chattanooga. He earned his Ph.D. in management information systems and operations management at the University of Nebraska. He is a recognized author, scholar, teacher, and consultant in the areas of business analytics, cybersecurity, information systems, and management science. His work has appeared in Omega, Transfusion, European Journal of Operational Research, Knowledge Management, Computers & Industrial Engineering, Total Quality Management & Business Excellence, and others. He is also the author of "Business Analytics with Management Science Models and Methods," published by Pearson/FT.

Table 1. Summary of data collection and analysis steps

Table 3. Model validity for factor analysis

Table 4. Factor correlation matrix
Note: Extraction Method: Principal Axis Factoring; Rotation Method: Promax with Kaiser normalization.