QUESTION 1

The TFIDF (or TF-IDF) is a measure that considers both  ______________ and ________________.

A.commonness of a term,  the scarcity of the term 

B.uncommonness of a term, the scarcity of the term

C.length of a term,  the scarcity of the term  

D.uncommonness of a term, the weakness of the term

5 points  

QUESTION 2

The term __________ refers to a specific implementation of association rules mining that many companies use for a variety of purposes.

A.market research analysis

B.market prediction analysis

C.market competitive analysis

D.market basket analysis

5 points  

QUESTION 3

A distribution over a fixed vocabulary of words is formally defined as a __________.

A.subject

B.topic

C.story

D.text line

5 points  

QUESTION 4

Your customer provided you with 3,000 unlabeled records and asked you to separate them into three groups. What is the correct analytical method to use?

A.K-means clustering

B.Naive Bayesian classification

C.Linear regression

D.Logistic regression 

5 points  

QUESTION 5

Time series analysis attempts to model the underlying structure of ________________ taken over time.

A.observation

B.patterns

C.solution

D.facts

5 points  

QUESTION 6

Which of the following algorithms is not an example of an ensemble learning algorithm?

A.Random Forest

B.Adaboost

C.Gradient Boosting

D.Decision Trees  

5 points  

QUESTION 7

What is the difference between supervised learning and unsupervised learning?

A.Supervised learning algorithms work on data that are labeled. On the other hand, unsupervised learning algorithms work on unlabeled data.

B.Supervised learning algorithms work on data that are unlabeled. On the other hand, unsupervised learning algorithms work on labeled data.

C.Supervised learning algorithms work on raw data. On the other hand, unsupervised learning algorithms work on processed data.

D.None of these

5 points  

QUESTION 8

The ____________ is the most iterative phase and the one for which teams tend to underestimate the amount of effort involved.

A.Discovery Phase

B.Model Building Phase

C.Operationalization Phase

D.Data Preparation Phase

5 points  

QUESTION 9

Which levels does fdata contain in the following R code? data = c(1,2,2,3,1,2,3,3,1,2,3,3,1); fdata = factor(data)

A.2,3,2

B.1,2,3

C.5,3,1

D.1,2,6

5 points  
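
As a rough cross-check of what factor() does, the sketch below mimics in Python how R derives a factor's levels: by default they are the sorted distinct values of the input vector. The helper name factor_levels is hypothetical and exists only for this illustration; it is not part of R or of any Python library.

```python
def factor_levels(values):
    """Return the sorted distinct values, analogous to levels(factor(values)) in R."""
    return sorted(set(values))


# Same values as the R vector in the question
data = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2, 3, 3, 1]
levels = factor_levels(data)

print(levels)       # the distinct values, in sorted order
print(len(levels))  # how many levels there are
```

Repeated values never add levels, so the length of the vector has no bearing on the result.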

QUESTION 10

A _______ is a table-like data structure available in languages like R and Python

A.data frame

B.data file

C.data table

D.database

5 points  

QUESTION 11

Which of the following are Measures of Central Tendency?

A.Mean, Range, Mode

B.Mean, Standard Deviation, Range

C.Mode, Mean, Median

D.Range, Standard Deviation, Variance  

5 points  
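
The statistics named in the options fall into two families: some describe where the data is centered, others describe how spread out it is. A minimal Python sketch using the standard library's statistics module on a made-up sample (the numbers are an assumption, not quiz data) shows both families side by side.

```python
import statistics

sample = [2, 3, 3, 5, 7, 10]  # hypothetical data, invented for illustration

# Statistics that locate the center of the data
mean_value = statistics.mean(sample)
median_value = statistics.median(sample)
mode_value = statistics.mode(sample)

# Statistics that describe the spread of the data
value_range = max(sample) - min(sample)
variance_value = statistics.pvariance(sample)  # population variance
std_dev = statistics.pstdev(sample)            # population standard deviation

print(mean_value, median_value, mode_value)
print(value_range, variance_value, std_dev)
```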

QUESTION 12

________________________ is a probabilistic classification method based on Bayes’ theorem.

A.Naive function

B.Naive process

C.Naive Bayes

D.None of these

5 points  

QUESTION 13

What is a type I error? What is a type II error? Is one always more serious than the other? Why?

5 points  

QUESTION 14

The ______________ function builds a recursive partitioning and regression tree model and has four parameters.

A.lpart()

B.mpart()

C.rpart()

D.None of these

5 points  

QUESTION 15

In least squares regression, which of the following is not a required assumption about the error term?

A.The expected value of the error term is one.

B.The variance of the error term is the same for all values of x.

C.The values of the error term are independent.

D.The error term is normally distributed. 

5 points  

QUESTION 16

During the Model Building phase, the team builds and executes _____________________________________________.

A.The models based on the work done in the Planning phase

B.The business requirements provided by the business analyst

C.The models based on the work done in the Model Planning phase

D.None of the above

5 points  

QUESTION 17

Which of the following is the most important language in Data Science?

A.C#

B.Java

C.Ruby

D.R

5 points  

QUESTION 18

Your organization has a website where visitors randomly receive one of two coupons. It is also possible that visitors to the website will not receive a coupon. You have been asked to determine if offering a coupon to visitors to your website has any impact on their purchase decision. Which analysis method should you use?

A.One-way ANOVA

B.K-means clustering 

C.Association rules

D.Student T-test

5 points  

QUESTION 19

How many steps does a text analysis problem consist of?

A.2

B.1

C.3

D.4

5 points  

QUESTION 20

A time series can consist of all of the following components except:

A.Time lapse

B.Trend

C.Cyclic

D.Seasonality

5 points  

QUESTION 21

Additional time series methods include all of the following except:

A.Autoregressive Moving Average with Exogenous inputs (ARMAX)

B.Spectral analysis

C.Kalman filtering

D.Single variable time series filtering

5 points  

QUESTION 22

The goal of POS tagging is to ______ whose input is a sentence.

A.build a text file 

B.build a model 

C.build a database

D.build a text graph  

5 points  

QUESTION 23

What happens in the final Operationalize phase? 

A.Requirements are gathered

B.The team delivers final reports, briefings, code, and technical documents. They may also run a pilot project to implement the models in a production environment.

C.The team delivers draft reports, draft briefings, code, and some technical documents. They may also run a pilot project to implement the models in a production environment.

D.None of the above

5 points  

QUESTION 24

What can be done if during the Discovery Phase the team decides that the available data is insufficient?

A.Cancel the project

B.Collect Additional Data

C.Work with what you already have

D.Do nothing

5 points  

QUESTION 25

In regression, the equation that describes how the response variable (y) is related to the explanatory variable (x) is

A.the correlation model

B.the regression model

C.used to compute the correlation coefficient

D.None of the above

5 points  

QUESTION 26

In chapter 8, a time series consists of an __________________ sequence of equally spaced values over time.

A.Unordered

B.Bilateral

C.ordered

D.lateral

5 points  

QUESTION 27

One advantage of ARIMA modeling is that the analysis can be based on _________________________ for the variable of interest.

A.future time series data

B.historical time series data

C.historical time lapse data

D.None of the above

5 points  

QUESTION 28

What are the ‘resources’ being assessed in the Discovery Phase?

A.Cloud Resources

B.The business environment and business partners’ resources

C.Technology, Tools, Systems, Data, and People

D.None of the above

5 points  

QUESTION 29

Suppose you are using a bagging-based algorithm, say a Random Forest, in model building. Which of the following can be true?

  1. The number of trees should be as large as possible
  2. You will have interpretability after using Random Forest

A.1

B.2

C.1 and 2

D.None of these   

5 points  

QUESTION 30

The HDFS block size is larger than the size of the disk blocks so that _____________________

A.Only HDFS files can be stored in the disk used.

B.The seek time is maximum

C.Transfer of large files made of multiple disk blocks is not possible.

D.A single file larger than the disk size can be stored across many disks in the cluster.

5 points  

QUESTION 31

The IDF inversely corresponds to the ______________________ , which is defined to be the number of documents in the corpus that contain a term.

A.document frequency (DF)

B.directory frequency (DF)

C.docker frequency (DF)

D.None of the above

5 points  
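
The definition in this question can be sketched in a few lines of Python. The corpus below is invented for illustration, the function names are hypothetical, and log(N / DF) is only one of several IDF variants in common use, so treat this as a sketch under those assumptions.

```python
import math

# Hypothetical three-document corpus, already split into tokens
corpus = [
    ["big", "data", "analytics"],
    ["data", "science", "with", "r"],
    ["time", "series", "analytics"],
]


def document_frequency(term, docs):
    """Number of documents in the corpus that contain the term."""
    return sum(1 for doc in docs if term in doc)


def inverse_document_frequency(term, docs):
    """One common variant: log(N / DF). Assumes the term occurs in at least one document."""
    return math.log(len(docs) / document_frequency(term, docs))


def tf_idf(term, doc, docs):
    """Raw count of the term in one document, weighted by the term's IDF."""
    return doc.count(term) * inverse_document_frequency(term, docs)


print(document_frequency("data", corpus), inverse_document_frequency("data", corpus))
print(document_frequency("r", corpus), inverse_document_frequency("r", corpus))
```

A term that appears in fewer documents gets a larger IDF, which is the inverse relationship the question describes.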

QUESTION 32

The arima() function in R uses ___________________________________ to estimate the model coefficients.

A.Maximum Likelihood Estimation (MLE)

B.Mini Likelihood Estimation (MLE)

C.Minimum Likelihood Estimation (MLE)

D.Mining Likelihood Estimation (MLE)

5 points  

QUESTION 33

R functionality is divided into a number of ________

A.Stored Procedures

B.Functions

C.Domains

D.Packages

5 points  

QUESTION 34

A __________________________ is a simple and widely used visualization for finding the relationship among multiple variables and can represent data with up to five variables.

A.scatterplot

B.Dotchart and Barplot

C.Straight Plot

D.Box-and-Whisker Plot

5 points  

QUESTION 35

Which of the following R functions can best provide descriptive statistics, such as the mean and median, about a variable such as the sales data frame?

A.ggplot2()

B.dplyr()

C.stringr()

D.summary()

5 points  

QUESTION 36

Your colleague, who is new to Hadoop, approaches you with a question. They want to know how best to access their data. This colleague has a strong background in data flow languages and programming. Which query interface would you recommend?

A.Howl

B.Pig

C.Hive

D.HBase 

5 points  

QUESTION 37

Many quantitative analysts use R as their ____ tool.

A.Leading tool

B.Programming tool

C.Primary Tool

D.All of the above

5 points  

QUESTION 38

In R, the ___________________ function creates a time series object from a vector or a matrix. 

A.ts()

B.tk()

C.ttime()

D.plot()

5 points  

QUESTION 39

According to Chapter 4 of your textbook, clustering analysis groups _______________ objects based on the objects’ __________.

A.similarity, cost

B.position, similarity

C.similarity, attributes

D.rank, attributes

5 points  

QUESTION 40

You have run the association rules algorithm on your data set, and the two rules {banana, apple} => {grape} and {apple, orange} => {grape} have been found to be relevant. What else must be true?

A.{banana, apple, grape, orange} must be a frequent itemset.

B.{banana, apple} => {orange} must be a relevant rule.

C.{grape} => {banana, apple} must be a relevant rule.

D.{grape, apple, orange} must be a frequent itemset. 

5 points  
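
The reasoning behind this question rests on how support and confidence are computed: a rule X => Y can only be reported if the combined itemset X ∪ Y is itself frequent. The Python sketch below works that through on a small set of made-up baskets (the transactions and helper names are assumptions for illustration, not data from the quiz).

```python
# Hypothetical transactions, invented for illustration only
transactions = [
    {"banana", "apple", "grape"},
    {"apple", "orange", "grape"},
    {"banana", "apple", "grape", "milk"},
    {"apple", "orange", "grape", "bread"},
    {"milk", "bread"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(lhs, rhs, baskets):
    """Support of lhs and rhs together, divided by the support of lhs alone."""
    return support(set(lhs) | set(rhs), baskets) / support(lhs, baskets)


# Itemset behind the rule {apple, orange} => {grape}
rule_itemset_support = support({"apple", "orange", "grape"}, transactions)
rule_confidence = confidence({"apple", "orange"}, {"grape"}, transactions)

# The union of both rules' items need not occur together at all
all_four_support = support({"banana", "apple", "grape", "orange"}, transactions)

print(rule_itemset_support, rule_confidence, all_four_support)
```

In this toy data both rules hold, yet no basket contains all four fruits, which shows that two relevant rules do not force their combined itemset to be frequent.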

QUESTION 41

In regression analysis, the variable that is being predicted is the

A.Response, or dependent variable 

B.Independent variable 

C.Intervening variable

D.Usually X

5 points  

QUESTION 42

Which function is used to create a vector with more than one element?

A.library()

B.plot()

C.c()

D.par()

5 points  

QUESTION 43

During the Model Building phase, the team develops _____________________________

A.Data application for testing, training, and production purposes

B.Datasets for testing, training, and production purposes

C.Datasets for prediction, training, and production purposes

D.Datasets for testing, training, and development purposes

5 points  

QUESTION 44

A data analyst must know how to pick the most suitable method for a given classification problem. When there is nonlinear data or there are discontinuities in the input variables that would affect the output, the best method to choose would be

A.Simple Regression

B.Naive Bayes

C.Logistic programming

D.Decision Tree

5 points  

QUESTION 45

Vectors come in two types: _____ and _____.

A.Atomic vectors and lists

B.Atomic vectors and matrix

C.Atomic vectors and array

D.None of the above

5 points  

QUESTION 46

Which of the following is performed by a data scientist?

A.Define the question

B.Create reproducible code

C.Challenge results

D.All of the above

5 points  

QUESTION 47

Regression modeling is a statistical framework for developing a mathematical equation that describes how

A.one explanatory and one or more response variables are related

B.several explanatory and several response variables are related

C.one response and one or more explanatory variables are related 

D.All of these are correct.

5 points  

QUESTION 48

Text analysis, sometimes called text analytics, refers to the ___________, ________, and __________ of textual data to derive useful insights.

A.representation, processing, and modeling

B.representation, processing, and designing

C.presentation, processing, and modeling

D.presentation, storing, and modeling

5 points  

QUESTION 49

What is a major difference between BI and Analytics?

A.Analytics has predictive capabilities whereas BI helps in informed decision-making based on analysis of past data

B.Analytics has no predictive capabilities whereas BI helps in informed decision-making based on analysis of past data 

C.Analytics is not always reliable whereas BI helps in informed decision-making based on analysis of past data

D.None of the above

5 points  

QUESTION 50

A ___________________________ is a specific table layout that allows visualization of the performance of a classifier.

A.positive matrix

B.true matrix

C.confusion matrix

D.false positive rate matrix

5 points  
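
To make the table layout concrete, the Python sketch below tallies the four cells for a binary classifier from paired actual and predicted labels. The labels are invented for illustration and the function name is hypothetical; a real project would more likely use a library routine.

```python
def confusion_matrix(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            counts["TP" if truth == positive else "FP"] += 1
        else:
            counts["FN" if truth == positive else "TN"] += 1
    return counts


# Hypothetical labels: 1 = positive class, 0 = negative class
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = confusion_matrix(actual, predicted)
accuracy = (matrix["TP"] + matrix["TN"]) / len(actual)

print(matrix)
print(accuracy)
```

Metrics such as accuracy, the true positive rate, and the false positive rate are all ratios of these four counts.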

QUESTION 51

Which of the following is correct?

A.Raw data is the original source of data

B.Preprocessed data is original source of data

C.Raw data is the data obtained after processing steps

D.None of the mentioned 

5 points  

QUESTION 52

Unsupervised learning is where you only have input data (X) and ______________________output variables.

A.two corresponding

B.three corresponding

C.no corresponding

D.two or more

5 points  

QUESTION 53

Which of the following is not a critical characteristic of Big Data?

A.Velocity

B.Volume

C.Variety

D.Value

5 points  

QUESTION 54

Which of the following are identified in the Communicate Results phase?

A.Key findings

B.A quantification of the business value

C.A narrative to summarize and convey findings to stakeholders

D.All of the above

5 points  

QUESTION 55

Which SQL function is used to count the number of rows in a SQL query?

A.COUNT()

B.NUMBER()

C.SUM()

D.COUNT(*) 

5 points  
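
Two of the options differ only in their argument, and the distinction matters in standard SQL: COUNT(*) counts every row returned, while COUNT(column) counts only the rows where that column is not NULL. A minimal sketch using Python's built-in sqlite3 module shows the difference; the orders table and its contents are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical table; two rows have no coupon
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, coupon TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "A"), (2, None), (3, "B"), (4, None)],
)

# COUNT(*) counts rows; COUNT(coupon) skips rows where coupon is NULL
total_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
rows_with_coupon = conn.execute("SELECT COUNT(coupon) FROM orders").fetchone()[0]
conn.close()

print(total_rows, rows_with_coupon)
```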

QUESTION 56

Which of the following sorts the data frame x by the order of the elements in B?

A.x[rev(order(x$B)),]

B.x[ordersort(x$B),]

C.x[order(x$B),]

D.None

5 points  

QUESTION 57

Which of the following is not a good practical use of Big Data Analytics?

A.Location Tracking

B.Precision Medicine

C.Customer Discrimination

D.Fraud Detection & Handling

5 points  

QUESTION 58

How many types of R objects are present in R data type?

A.1

B.17

C.4

D.6

5 points  

QUESTION 59

An ______ is a workspace that is typically isolated from production applications and warehouse environments.

A.Private Environment

B.Sand Environment

C.Sandbox

D.Production box

5 points  

QUESTION 60

Data smoothing in predictive analytics is, essentially, trying to find the signal in the noise by discarding data points that are considered noisy. There are various smoothing techniques. Which of the following is NOT a smoothing technique?

A.Laplace smoothing

B.Kneser-Ney smoothing

C.Katz smoothing

D.Statistical smoothing

5 points  

QUESTION 61

All of the following are the steps followed during a text analysis problem:

A.searching, parsing, and retrieval, and text mining

B.saving, parsing, and retrieval, and text mining

C.searching, parsing, and retrieval, and text saving 

D.searching, parting, and retrieval, and text mining

5 points  

QUESTION 62

Unlike Pig and Hive, which are intended for _____________ , Apache HBase is capable of providing _________________ and write access to datasets with billions of rows and millions of columns.

A.batch application, historic data read

B.batch application, real-time data read

C.batch application, speed data read

D.batch application, deep data read

5 points  

QUESTION 63

When writing SQL query statements, a RIGHT OUTER JOIN is used to specify that _____________________________ from the table on the right-hand side of the join should be returned, regardless of whether a matching record is found.

A.some rows

B.partial rows

C.all rows

D.all columns

5 points  

QUESTION 64

The conditional probability of event C occurring, given that event A has already occurred, is denoted as

A.P(A| C)

B.P(A| A) C

C.P(C| C) A

D.None of these
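
In the usual notation the event of interest goes before the bar and the conditioning event after it, so the quantity described above is written P(C | A) and equals P(A and C) / P(A) whenever P(A) > 0. The Python sketch below checks the formula on a made-up die-rolling example; the events and the helper name are assumptions chosen for illustration.

```python
from fractions import Fraction


def conditional_probability(p_joint, p_given):
    """P(C | A) = P(A and C) / P(A); undefined when P(A) is zero."""
    if p_given == 0:
        raise ValueError("conditioning event has probability zero")
    return p_joint / p_given


# Hypothetical example: one roll of a fair six-sided die
# A = "roll is even" = {2, 4, 6}; C = "roll is greater than 3" = {4, 5, 6}
p_a = Fraction(3, 6)
p_a_and_c = Fraction(2, 6)  # outcomes {4, 6} satisfy both events

p_c_given_a = conditional_probability(p_a_and_c, p_a)
print(p_c_given_a)
```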

Your initial outline for your course project paper is due this module. Prepare a 1-2-page document that outlines how you will organize your course project paper. Your outline will be the skeleton from which you will write your project. Your outline should contain an idea for your introduction (the full introduction will be created in Module 03) and at least 3 headings for sections  that explain and analyze how technology has been used to improve healthcare delivery and information management for your selected topic (from Module 1), as well as implications, challenges, risks, and opportunities. You may use any standard outline format. Be sure to use correct grammar and spelling.


Link and combine all the different pages created in this course into one complete website with a database connection. The first page a user should see when they get to your club site is the login page, where they will be asked to log in or register. If the user does not have a login, they should be able to click on the registration link and be taken to the registration form, where they are able to register. After registering, the user should be directed back to the login page, where they can log into the main club site. On the main club site they should be given information about the site and the option to get member information from the database.

Export your database using the export your database tutorial and zip it along with all parts of your club website to upload for grading.

  

Attached you will find a live performance agreement. Review the gig agreement and answer some questions about it.

Follow the flow of the Agreement and tell us what you see as the questions focus your attention.  Answer the questions directly on the paper.

 

Explore the EEOC website to learn more about the organization.

Click the About the EEOC link and select Newsroom. Select a press release about an employee lawsuit published within the last six months.

Search the Internet to find at least one news item about this lawsuit, preferably from a news source in the state in which the incident occurred.

Write a 1,050- to 1,400-word paper that includes the following:

  • A description of the compliance issue that led to the lawsuit and its ramifications for the organization
  • A brief summary of the functions of the EEOC in one paragraph
  • The EEOC’s role in this lawsuit
  • Whether or not this lawsuit promotes social change; justify your reasoning
  • A comparison of the EEOC press release to the news item. What accounts for the differences?
  • Strategies you would implement, if you were a senior manager of this company, to ensure future compliance and inclusion in the multicultural workplace

Cite your sources and the textbook Understanding and Managing Diversity.

Format your paper according to appropriate course-level APA guidelines.

Submit your assignment.

Present an oral summary of your case to your peers during our week four class.

   

Whatever (in x, y; out z)
begin
    x <— x * 2;
    if x > 0 then
        y <— y + 200;
    else
        y <— 5 * y;
    endif;
    while x > y do
        x <— x 10;
    endwhile;
    z <— y x;
end;

  

a. List and briefly discuss three goals of performing software testing. Why do we test software systems?

b. Describe briefly how you would apply Functional (Black Box) Testing to this subroutine. Why would you perform functional testing?

c. Describe briefly how you would apply Structural (White Box) Testing to this subroutine. Why would you perform structural testing?
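
One way to make parts (b) and (c) concrete is to port the subroutine to a runnable language and write test cases against it. The Python sketch below does that under a loud assumption: the listing is missing the operator in the assignment inside the while loop and in the final assignment to z, and this sketch guesses subtraction in both places. The assignment itself does not say so, and a different operator would change every expected value. The cases are chosen so that both branches of the if and both outcomes of the loop condition are exercised, in the spirit of structural testing; the expected results come from tracing the assumed version by hand.

```python
def whatever(x, y):
    """Python port of the Whatever subroutine; returns z.

    ASSUMPTION: the two operators missing from the listing are taken to be
    subtraction (x = x - 10 and z = y - x). The source does not show them.
    """
    x = x * 2
    if x > 0:
        y = y + 200
    else:
        y = 5 * y
    while x > y:
        x = x - 10  # assumed operator
    return y - x    # assumed operator


# (x, y, expected z, path exercised)
CASES = [
    (1, 0, 198, "if-branch taken, loop skipped"),
    (150, 0, 0, "if-branch taken, loop entered"),
    (-1, 1, 7, "else-branch taken, loop skipped"),
    (0, -1, 5, "else-branch taken, loop entered"),
]


def run_cases():
    """Return the (x, y) inputs whose actual result differs from the expected one."""
    return [(x, y) for x, y, expected, _ in CASES if whatever(x, y) != expected]


print(run_cases())  # an empty list means every case passed
```

A functional (black box) test would pick inputs from the specification alone, for example positive, zero, and negative values of x, without looking at the branches; the same harness can run either kind of case table.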

 

  1. Business-Level and Corporate-Level Strategies
    Overview
    In this assignment, you are to use the same corporation you selected and focused on for Assignment 1: Strategic Management and Strategic Competitiveness and Assignment 2: External and Internal Environments.
    Research the company on its own website, the public filings on the Securities and Exchange Commission , the University’s , the , and any other sources you can find. The annual report will often provide insights that can help address some of these questions.
    Requirements
    Write a six- to eight-page paper in which you do the following:
    • Analyze the business-level strategies for the corporation you chose to determine the business-level strategy you think is most important to the long-term success of the firm and whether or not you judge this to be a good choice. Justify your opinion.
    • Analyze the corporate-level strategies for the corporation you chose to determine the corporate-level strategy you think is most important to the long-term success of the firm and whether or not you judge this to be a good choice. Justify your opinion.
    • Analyze the competitive environment to determine the corporation’s most significant competitor. Compare their strategies at each level and evaluate which company you think is most likely to be successful in the long term. Justify your choice.
    • Determine whether your choice from Question 3 would differ in slow-cycle and fast-cycle markets.
    • Use at least three quality references. Note: Wikipedia and other websites do not qualify as academic resources.
    • Your assignment must follow these formatting requirements:
    • This course requires use of new (SWS). The format is different than other Strayer University courses. Please take a moment to review the SWS documentation for details.
    • Be typed, double spaced, using Times New Roman font (size 12), with one-inch margins on all sides; references must follow SWS or school-specific format. Check with your professor for any additional instructions.
    • Include a cover page containing the title of the assignment, the student’s name, the professor’s name, the course title, and the date. The cover page and the reference page are not included in the required page length. 
    • Use the to ensure that your assignment meets the above requirements.
      The specific course learning outcome associated with this assignment is as follows:
    • Determine business-level and corporate-level strategies for a corporation’s long-term success comparable to the competitive environment.
    • Grading for this assignment will be based on answer quality, logic and organization of the paper, and language and writing skills, using the scoring rubric.

1.

Defining Modernism

Read, review, and annotate Howe’s “The Culture of Modernism” article with the intention of defining modernism and/or answering the questions: what are the characteristics of modernist art and literature and what is the historical context and influences on the culture and artistic output during the modernist period? Make notes in an evidence/interpretation log, like this:

Characteristic | Evidence | Source
disturbing | “chooses subjects that disturb the audience and threaten its most cherished sentiments” |

Outline at least 5 characteristics of modernist art and literature, including values, aesthetics, and historical context. Tell us what page in Howe’s article you are drawing this from.

Let’s try to get a comprehensive list with as little repetition as possible; use evidence from all pages of the article. We will use these responses through the rest of this module and your Modernism Essay.

2.

Analyzing Modernist Literature 1: Hemingway

Read Hemingway’s “A Clean, Well-Lighted Place” and annotate it, looking for aspects that match Howe’s definition of modernism.

A) List three examples of how you think this story fits the definition, with specific quotations from the story. Use an MLA in-text citation to identify the source of the quotation. Also identify the page on which this characteristic appears in Howe’s article. For example: Howe defines modernist literature as “dark” and “disturbing” (1); we see this in Hemingway’s story when the waiters discuss the old man’s attempted suicide (3). (Don’t use these examples.)

B) Explain any aspect of the story that you think doesn’t fit our definition of modernism.

  

HOMEWORK ASSIGNMENT #1: How do we read a contract so that we can understand what is being said? It has everything to do with knowing the language of the music business. It is not a foreign language. You are probably familiar with many of the words. HOW the words are USED IN THE MUSIC BUSINESS is very important.

Lesson plan can be in the form of a lesson plan template, Word document, or PowerPoint.

Lesson Plan-Online 

             

Design a lesson plan on one subject from Topic 1 or 2 provided in the topics list from your instructor. Include the following:

  1. Overview. Write an introduction to the class activity. Include the purpose of the activity and desired outcome.
  2. Objectives. The objectives should be specific and measurable.
  3. Time. How long will the activity take when implemented in the classroom?
  4. Materials. Describe any materials that are needed to conduct the lesson.
  5. Activity. Provide a detailed description of the activity. Write all steps from the instruction of the assessment.

Your lesson plan may be in any form approved by the instructor.

GCU style is not required, but solid academic writing is expected.

Refer to the Lesson Plan Scoring Guide prior to beginning the assignment to become familiar with the expectations for successful completion.