You searched for: balancing (English - Persian)

Machine translation

Trying to learn how to translate from human translation examples.

English

Persian

Info

English

balancing

Persian

 

From: Machine translation
Quality:

Human contributions

From professional translators, companies, websites and freely available translation databases.

English

Persian

Info

English

left/right balancing

Persian

مخلوط‌کنها

Last updated: 2011-10-23
Usage frequency: 1
Quality:

English

self-balancing binary search tree

Persian

درخت جستجوی دودویی خود-متوازن

Last updated: 2014-11-10
Usage frequency: 1
Quality:

Reference: Wikipedia

English

with balancing the joint checking account.

Persian

موازنه کردن . حساب هاي مشترک کار ميکنيم .

Last updated: 2011-10-24
Usage frequency: 1
Quality:

Reference: Wikipedia

English

Rabbi Jackie Tabick: The balancing act of compassion

Persian

جَکی تَبیک: عملِ توازنِ دل سوزی و رحم

Last updated: 2015-10-13
Usage frequency: 1
Quality:

Reference: Wikipedia

English

Hassan Rouhani's Iranian nuclear balancing act · Global Voices

Persian

دومین سفر حسن روحانی در مقام رئیس جمهور ایران به نیویورک برای شرکت در مجمع عمومی سازمان ملل، بدون دست آورد قابل ملاحظه‌ای در مذاکرات جامع اتمی ایران و ۵+۱ و در راس آنها آمریکا، پایان یافت.

Last updated: 2016-02-24
Usage frequency: 1
Quality:

Reference: Wikipedia

English

Design considerations: One key aspect of pipeline design is balancing pipeline stages.

Persian

یکی از جنبه‌های کلیدی طراحی خط لوله برقراری تعادل در مراحل موازی سازی است.

Last updated: 2016-03-03
Usage frequency: 1
Quality:

Reference: Wikipedia

English

AVL trees and red-black trees are both forms of self-balancing binary search trees.

Persian

درخت ای‌وی‌ال و درخت سرخ-سیاه از جمله درختان دودویی خودمتوازن هستند.

Last updated: 2016-03-03
Usage frequency: 1
Quality:

Reference: Wikipedia

English

The reason for this is the more complex balancing required for spatial data as opposed to linear data stored in B-trees.

Persian

دلیل این مورد، پیچیده تر بودن متوازن سازی برای داده‌های مکانی در برابر داده‌های خطی ذخیره شده در درخت b است.

Last updated: 2016-03-03
Usage frequency: 1
Quality:

Reference: Wikipedia

English

And balancing them all in the middle is this notion of compassion, which has to be there, if you like, at our very roots.

Persian

و نظریه ی شفقت، تعادلِ همه ی آنها در میانه است. که الزاماً در ریشه ی ما باید باشد.

Last updated: 2015-10-13
Usage frequency: 1
Quality:

Reference: Wikipedia

English

The second feedback loop on the left is negative reinforcement (or "balancing" and hence labeled B).

Persian

حلقهٔ بازخورد دوم در سمت چپ، تقویت منفی (که با b برچسب‌زده شده) است.

Last updated: 2016-03-03
Usage frequency: 1
Quality:

Reference: Wikipedia
Warning: Contains invisible HTML formatting

English

Being carbon neutral refers to achieving net zero carbon emissions by balancing a measured amount of carbon released with an equivalent amount sequestered or offset, or buying enough carbon credits to make up the difference.

Persian

خنثی بودن از لحاظ کربن به معنای دستیابی به انتشار کربن صفر بوسیلهٔ تعادلی میان میزان کربن اندازه‌گیری شده آزاده شده با میزان معادل جدا شده و یا جبران شده و یا خریدن میزان کافی اعتبار کربن برای جبران تفاوت موجوداست.

Last updated: 2016-03-03
Usage frequency: 1
Quality:

Reference: Wikipedia

English

And this skill might just be one of balancing between Iran’s Ayatollah Ali Khamenei and the United States, as we saw during Rouhani’s recent trip to New York.

Persian

به عنوان مثال روحانی در مصاحبه با کریستیان امانپور گفت در ایران خبرنگاری به دلیل فعالیت‌های رسانه ای دستگیر نمی‌شود، یا در بنیاد آمریکای نوین حاضر نشد تایید کند ۹۱ ضربه شلاق و شش ماه تا یک سال حبس تعلیقی برای سازندگان ویدیوی «هپی» در ایران حکم سنگینی است.

Last updated: 2016-02-24
Usage frequency: 1
Quality:

Reference: Wikipedia

English

Due to the growing awareness of the relevance of this unbridled expansion with increasing concerns about social justice, economic vitality and environmental durability, the concept of sustainable development has emerged in the path of balancing between these dimensions and has become an important and pivotal perspective in various disciplines.

Persian

به دلیل آگاهی روزافزون از مرتبط بودن این گسترش افسارگسیخته با افزایش نگرانی ها بسیار در مورد عدالت اجتماعی، نشاط اقتصادی و دوام زیست محیطی، مفهوم توسعه پایدار در مسیر ایجاد تعادل در میان این ابعاد ظهور نموده است و به چشم انداز مهم و محوری در رشته های مختلف تبدیل شده است

Last updated: 2023-11-26
Usage frequency: 1
Quality:

Reference: Wikipedia

English

A Taxonomy of Job Scheduling on Distributed Computing Systems

Raquel V. Lopes, Member, IEEE, and Daniel Menascé, Fellow, IEEE

Abstract: Hundreds of papers on job scheduling for distributed systems are published every year and it becomes increasingly difficult to classify them. Our analysis revealed that half of these papers are barely cited. This paper presents a general taxonomy for scheduling problems and solutions in distributed systems. This taxonomy was used to classify and make publicly available the classification of 109 scheduling problems and their solutions. These 109 problems were further clustered into ten groups based on the features of the taxonomy. The proposed taxonomy will help researchers build on prior art, increase new research visibility, and minimize redundant effort.

Index Terms: Taxonomy, scheduling, distributed jobs, cluster, grid computing, cloud computing.

1 Introduction

In the last decade, cluster computing emerged as the main platform for high performance, grid, and cloud computing. Together, these three different, yet very similar platforms, emerged as important sources of computing power. They all consist of distributed computers (or nodes) connected through high speed networks.

Most of the scheduling problems are computationally hard [1], [2], [3], and they have been attracting the attention of researchers for decades. Thousands of solutions have been published, dealing with slightly different versions of a scheduling problem. Indeed, there are many knobs that may be tuned in order to clearly specify a scheduling problem of this nature.
To the best of our knowledge, these knobs have not been defined for general scheduling problems, leading an important researcher to call for a proper definition of scheduling problems:

At the very minimum, we wish that all papers about job schedulers, either real or paper design, make clear their assumptions about the workload, the permissible actions allowed by the system, and the metric that is being optimized. [4]

Twenty years later, the situation has not improved. So far, the many knobs needed to define a scheduling problem have been tuned on an ad hoc, individual basis. It is time for change. While hundreds of papers on scheduling are published every year, it becomes increasingly difficult to easily identify scheduling problems and solutions. We are not aware of any general taxonomy to define job scheduling problems and solutions in distributed systems. This paper aims at shedding light on this scenario by defining such a taxonomy and classifying a great deal of papers through the use of this taxonomy.

R. Lopes is with the Departamento de Sistemas e Computação, Universidade Federal de Campina Grande, Paraíba, Brazil. E-mail: raquel@dsc.ufcg.edu.br
D. Menascé is with the Department of Computer Science, George Mason University, Fairfax, VA 22030. E-mail: menasce@gmu.edu.
Manuscript received September 00, 2015.

Early seminal work aimed at defining taxonomies to classify scheduling problems and solutions exists. An important work defines a taxonomy for distributed job scheduling solutions [5]. Another inspiring work defines a language to specify scheduling problems [6]. In spite of the inspiring nature of these seminal propositions, a general taxonomy that takes into account the new generation of distributed systems and scheduling problems and solutions is required. More recently, some researchers have defined taxonomies for specific types of distributed platforms.
However, none try to cover distributed systems in general, which we argue is the most appropriate approach. The authors of [7] define a taxonomy of scheduling problems in grid computing platforms. Smanchat and Viriyapant [8] extend the grid taxonomy to define a taxonomy of scheduling problems in cloud computing. These taxonomies overlap in some aspects, especially those describing workload and solution, and at the same time, they are over-fitted models, not general enough to be applied to any kind of distributed platform known today. They consider properties that represent very specific details of each resource platform. For example, the grid taxonomy [7] only considers scheduling problems that target multi-criteria decision analysis involving cost. This excludes many scheduling problems in which cost is not considered or in which the scheduling goal considers one criterion, like minimization of makespan, which is historically the most popular scheduling goal. Some properties are highly coupled with grid environments, such as the cost model flexibility, and intra- and interdependence among scheduling criteria. The taxonomies of workflow scheduling techniques in the cloud assume that resources are virtual machines, which is not true for all distributed platforms, even for the cloud (footnote 1). Some properties of the cloud taxonomy are highly coupled with traditional cloud environments, such as VM startup latency and provisioning model (on-demand, reservation or spot).

1. Metal as a service has recently arisen as a new model in which the cloud user deploys directly onto bare metal for optimum performance. OpenStack, for instance, is considering this new model (https://wiki.openstack.org/wiki/ironic).

It is also important to point out that the taxonomies mentioned above fail to consider some properties that are important to clearly define scheduling problems and solutions.
For instance, they do not define workload composition in a complete fashion, nor resource sharing or scaling. They also do not consider important requirements such as data locality and failure model. Finally, they do not include properties that characterize the quality of service required by the workload. We argue that these and other features must be considered.

We conclude that prior work in scheduling taxonomies is not generic or complete enough for classifying scheduling problems and solutions in distributed platforms. They focus on specific resource categories rather than on distributed resources in general. We argue that a unified taxonomy is possible and, in fact, needed, in opposition to many specific overlapping taxonomies for each type of distributed platform. Moreover, hybrid infrastructures are increasingly common, in which different cloud or grid computing infrastructures inter-operate [9], [10]; these are cases that can be modeled by a unified taxonomy. Finally and most importantly, it is easier to maintain a single taxonomy over the years than to maintain many different, overlapping ones.

For these reasons, we have defined our own taxonomy to classify existing (and future) scheduling problems and solutions. The taxonomy targets the scheduling of jobs in distributed systems. The solution is clearly meaningless without the associated problem. The problem, however, can be useful alone for comparison reasons. So, we organize the taxonomy in such a way that the problem and the solution can be easily separated. We propose the use of the taxonomy to (i) instantiate different scheduling problems and (ii) classify different scheduling solutions.

The contributions of this paper are four-fold. First, a comprehensive taxonomy for classifying scheduling problems and solutions is defined.
This taxonomy allows a researcher to define what is claimed, i.e., which portion of the scheduling problem space is being addressed, and to define the properties of the scheduling solution in a comprehensible fashion. This taxonomy provides a snapshot of the state of the art of job scheduling in distributed systems. Second, we perform an analysis of the impact of a subset of 1058 papers related to job scheduling in distributed systems from 2005 to 2015 (May 1st). Third, we apply the taxonomy to classify 109 scheduling problems and solutions published in the top-102 papers in the area, considering the number of citations per year. Finally, we publish an online scheduling archive, collaboratively constructed, in which classified scheduling problems and solutions may be found and others may be added.

We found that almost 22% of the papers related to job scheduling in distributed systems are never cited; 12% of the papers in the area are responsible for 66% of all citations, and 40% of the papers are cited at most twice in their entire life. This is a sad indication that we are still crawling towards a real scientific methodology. We hope that by classifying the papers using a well-known taxonomy, researchers will be able to clearly indicate what kinds of problems and solutions they are claiming. As a consequence, the classification will allow new research to be built on top of the prior art and it will be easier to know the state of the art regarding specific instantiations of scheduling problems.

Richard Hamming identified a central problem of computer science during his Turing Award lecture:

Perhaps the central problem we face in all of computer science is how we are to get to the situation where we build on top of the work of others rather than redoing so much of it in a trivially different way. Science is supposed to be cumulative, not almost endless duplication of the same kind of things.
[11]

We believe that building an adequate taxonomy constitutes a first step in the direction pointed out by Hamming. Without proper mechanisms to classify work we are doomed to ignore what others have done. Other steps are still necessary, in particular the discipline to use the taxonomy from now on and to keep it up to date. An important action in this regard is to maintain an archive of scheduling problems and solutions based on the taxonomies. For that purpose, we created a web site, the DSS Archive (Distributed Systems Scheduling) (footnote 2). We initially populated the site with the classification of 109 problems and their solutions. The idea is to collaboratively increase the number of papers cataloged. The site offers a form to facilitate the inclusion of new scheduling problems/solutions in the archive. Researchers can download the data set with all the problems and solutions classified so far and then manipulate the data using their statistical tools of preference (footnote 3).

The rest of this paper is organized as follows. Section 2 presents a background on scheduling theory and defines a scheduling problem. Section 3 introduces a taxonomy for scheduling in distributed systems that contemplates problems and solutions. Section 4 summarizes the research method and underlying review protocol, which was used to collect 1058 papers published in the last decade on job scheduling in distributed systems. The next section presents statistics about these papers, including popularity and resource categories considered. The taxonomy was used to classify 109 scheduling problems and respective solutions. The results are summarized in Section 6. Related work is discussed in Section 7. Section 8 concludes with recommendations for future research on the topic.

2 Background on scheduling theory

This section provides a conceptual model of scheduling problems and solutions in distributed computer systems. Some definitions in this section are based on previous work [2], [12].
We do not consider in this paper single-node scheduling problems, which have been thoroughly investigated in the field of operating systems.

Scheduling is the assignment of resources to consumers in time. In general, every instance of a scheduling problem must clearly specify three components:

• Workload, defines the consumers of the resources. In the context of this paper a workload is composed of jobs, defined as a collection of computational tasks. Thus, a job $j$ has $n_j$ tasks $t^j_1, \ldots, t^j_{n_j}$.
• Resources, required to execute the workload, consist of a set of distributed nodes or computers, with one or more processing cores, connected by a (typically high-speed) network. These resources may be organized in computing clusters in a local environment or in widely distributed and scalable data centers [13]. Resources are assumed to be able to execute any type of computational task and consist of whole computing units, with main memory, storage devices and network access. We assume that nodes can only communicate through message exchange.
• Scheduling requirements determine the scheduling goal and other requirements that must be met by the solution. Typically, the scheduling goal is to optimize one or a combination of performance metrics affected by scheduling decisions. Another important scheduling requirement is the scheduling level. It determines the granularity or the level of detail considered when making a scheduling decision. We consider two levels of scheduling decisions: job and task (footnote 4).

2. http://lsd.ufcg.edu.br/~dssarchive
3. We provide R scripts to facilitate data manipulation.

Scheduling is typically a dynamic activity: workload and resources may vary over time. In order to model these dynamic aspects, we consider $\mathbb{R}^+$ to denote the set of time instants of interest, which may be discrete or continuous. At any time $t$ the workload is composed of a set $\mathcal{W}_t$ of jobs. At any time $t$ the resources consist of a set $\mathcal{R}_t$ of resources.
Nevertheless, there are static properties of the workload and/or resources that do not change over time and are the core of our taxonomy. Let $\mathcal{W}$ and $\mathcal{R}$ represent the static aspects of the workload and resources, respectively. Let $\mathcal{Q}$ be the set of scheduling requirements that must be satisfied. We define a scheduling problem as a tuple $(\mathcal{W}, \mathcal{R}, \mathcal{Q})$. A scheduling solution is associated with a given scheduling problem. There may be more than one solution to the same problem.

3 Scheduling taxonomy in distributed systems

The proposed taxonomy is organized into two parts: one characterizes a scheduling problem and another a scheduling solution. The problem part (see Figure 1) consists of 17 static features that fall into three groups: workload (W), resources (R), and requirements (Q).

3.1 Workload description

Seven features characterize the workload $\mathcal{W}$.

1 - Job source. Defines if jobs come from multiple users or a single user and if the workload consists of multiple jobs or a single job. Reasonable combinations are: single-user/single-job, single-user/multi-job and multi-user/multi-job. When the workload comes from many users, scheduling is often performed from the provider standpoint.

4. Each task consists of one or more (lightweight) processes that must be scheduled at the computing node assigned to run the task. This constitutes a third level of scheduling, i.e., process-level, typically managed by the operating system. This level of scheduling is outside the scope of this paper.

Fig. 1. Summary of static features related to a scheduling problem.

2 - Job structure. Defines the allowed number of tasks per job and the dependency relations and communication needs among the tasks. First, this feature defines if jobs are multi- or single-task. For multi-task jobs, one has to determine the task homogeneity. Tasks are homogeneous when they require similar resource demands and are heterogeneous otherwise.
The tasks of a job may have precedence constraints and communication needs to be satisfied, in which case they are dependent. Dependency between tasks often brings to the scheduling problem the challenge of data locality, since data transfers come at a cost. When there are neither precedence relations among the tasks nor communication needs, tasks are independent. Based on this discussion, the job structure may be: single-task, independent homogeneous multi-task, independent heterogeneous multi-task, dependent homogeneous multi-task or dependent heterogeneous multi-task. The trivial case of a single-job and single-task workload is not interesting and is not considered here.

3 - Job flexibility. Rigid jobs require a fixed quantity of resources and cannot execute on fewer or more resources. This quantity is defined by the user at job submission time. Other classes of jobs exist [4]: moldable, malleable and evolving. When a moldable job is submitted, some entity, possibly a scheduler, decides on the quantity of resources to provide the job. This quantity cannot be reconfigured during the job execution. Malleable jobs are moldable jobs whose computing requirements can be changed during execution by the scheduler or other system entity. Finally, evolving jobs are similar to malleable jobs, but the user decides, on the fly, about the quantity of resources to assign to the job.

4 - Arrival process. Determines the set of jobs considered by the scheduler when making scheduling decisions. In an open workload model, jobs come to the system at any time and leave the system after being executed, i.e., the number of jobs in the system is not constant. In a closed workload, the number of jobs to be scheduled is fixed.

5 - Workload composition. This feature is determined by the programming model, which drives the kinds of relationships that must hold between the tasks of a job.
Some examples include bags of tasks, in which all tasks are independent from one another, and MapReduce jobs, in which all map tasks must finish before the reduce tasks start execution. A workload may be formed by jobs that follow the same programming model or may be heterogeneous. A workload that consists of jobs of the same programming model may be classified as: same model/homogeneous, when jobs are similar in terms of structure, number of tasks and resource demands; same model/same structure, when jobs are similar in terms of structure and number of tasks but differ in terms of resource demands; or same model/diverse, when jobs use the same programming model but have different structure, number of tasks, and resource demands. Dependence relations and communication patterns do not exist if jobs are single-task. As a consequence, when the workload consists of multiple single-task jobs, the workload composition must be same model/homogeneous or same model/same structure.

6 - Quality of service. Jobs may be associated with service level agreements (SLAs). Penalties may be imposed when SLAs are violated. These jobs are SLO aware, since they require service level objectives (SLOs) to be met. Jobs that are not associated with SLAs are considered best effort jobs.

7 - Real time. The workload may consist of real time jobs or non real time jobs. For the former case, we distinguish between real time jobs with hard deadlines and soft deadlines. We also consider whether tasks are periodic or aperiodic. A hard or soft real time workload is necessarily SLO aware.

3.2 Resource description

We identified five features that characterize the resources.

1 - Resource heterogeneity. Homogeneous resource platforms consist of similar nodes in terms of processing power, storage, and networking capabilities. Heterogeneous resource platforms consist of nodes with different computing powers, in terms of processing, storage, or communication speeds.

2 - Resource scaling.
The scheduler can see the resources it can use as a fixed or dynamic infrastructure in terms of processing capacity. Some infrastructures allow rapid capacity changes in response to variations in the workload. The total capacity of a fixed-capacity resource platform does not vary in the short term. On the other hand, some distributed systems allow dynamic scaling. Three common situations lead to dynamically scalable infrastructures: (i) shutdown resources, when some nodes are turned off to save energy, temporarily reducing the online capacity of the infrastructure. The total capacity is rapidly restored by turning on the machines; (ii) outsourcing, when it is possible to rapidly acquire resources from other resource providers, such as infrastructure as a service (IaaS) providers or grid peers; (iii) DVFS, when dynamic voltage and frequen
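
The workload features listed in Section 3.1 can be written down as a small data model. The Python sketch below is only an illustration under stated assumptions: the names `JobStructure`, `Composition`, `RealTime`, `Workload` and `check_workload` are hypothetical and do not come from the paper or from the DSS Archive, and only four of the seven workload features are modeled. It encodes two consistency rules stated in the text: a workload of single-task jobs must be same model/homogeneous or same model/same structure, and a hard or soft real time workload is necessarily SLO aware.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class JobStructure(Enum):
    # The five job structures named in feature 2 (job structure).
    SINGLE_TASK = "single-task"
    INDEPENDENT_HOMOGENEOUS = "independent homogeneous multi-task"
    INDEPENDENT_HETEROGENEOUS = "independent heterogeneous multi-task"
    DEPENDENT_HOMOGENEOUS = "dependent homogeneous multi-task"
    DEPENDENT_HETEROGENEOUS = "dependent heterogeneous multi-task"


class Composition(Enum):
    # Feature 5 (workload composition).
    HOMOGENEOUS = "same model/homogeneous"
    SAME_STRUCTURE = "same model/same structure"
    DIVERSE = "same model/diverse"
    HETEROGENEOUS = "heterogeneous"  # jobs follow different programming models


class RealTime(Enum):
    # Feature 7 (real time).
    NONE = "non real time"
    SOFT = "soft deadlines"
    HARD = "hard deadlines"


@dataclass(frozen=True)
class Workload:
    """Hypothetical record holding a subset of the workload features."""
    job_structure: JobStructure
    composition: Composition
    real_time: RealTime
    slo_aware: bool  # feature 6: SLO aware (True) or best effort (False)


def check_workload(w: Workload) -> list[str]:
    """Return the violated consistency rules (empty list if none)."""
    problems = []
    allowed = (Composition.HOMOGENEOUS, Composition.SAME_STRUCTURE)
    # Rule from feature 5: single-task jobs have no dependence relations,
    # so only the two "similar structure" compositions are possible.
    if w.job_structure is JobStructure.SINGLE_TASK and w.composition not in allowed:
        problems.append("single-task workload has an invalid composition")
    # Rule from feature 7: real time workloads are necessarily SLO aware.
    if w.real_time is not RealTime.NONE and not w.slo_aware:
        problems.append("real time workload must be SLO aware")
    return problems


if __name__ == "__main__":
    w = Workload(JobStructure.SINGLE_TASK, Composition.DIVERSE,
                 RealTime.HARD, slo_aware=False)
    for problem in check_workload(w):
        print(problem)
```

A classification tool could run such a check before accepting a new entry, so that inconsistent feature combinations are caught when a problem is recorded rather than when the data are analyzed.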

Persian

a taxonomy of job scheduling on distributed computing systems raquel v. lopes, member, ieee, and daniel menasce´, fellow, ieee abstract—hundreds of papers on job scheduling for distributed systems are published every year and it becomes increasingly difficult to classify them. our analysis revealed that half of these papers are barely cited. this paper presents a general taxonomy for scheduling problems and solutions in distributed systems. this taxonomy was used to classify and make publicly available the classification of 109 scheduling problems and their solutions. these 109 problems were further clustered into ten groups based on the features of the taxonomy. the proposed taxonomy will facilitate researchers to build on prior art, increase new research visibility, and minimize redundant effort. index terms—taxonomy, scheduling, distributed jobs, cluster, grid computing, cloud computing. ◆ 1 introduction n the last decade, cluster computing emerged as the main platform for high performance, grid, and cloud computing. together, these three different, yet very simi- lar platforms, emerged as important sources of computing power. they all consist of distributed computers (or nodes) connected through high speed networks. most of the scheduling problems are computationally hard [1], [2], [3], and they have been attracting the attention of researchers for decades. thousands of solutions have been published, dealing with slightly different versions of a scheduling problem. indeed, there are many knobs that may be tuned in order to clearly specify a scheduling problem of this nature. 
to the best of our knowledge, these knobs have not been defined for general scheduling problems, leading an important researcher to clamor for the need of a proper definition of scheduling problems: at the very minimum, we wish that all papers about job schedulers, either real or paper design, make clear their assumptions about the workload, the permissible actions allowed by the system, and the metric that is being optimized. [4] twenty years later, the situation has not improved. so far, the many knobs needed to define a scheduling problem have been tuned on an ad hoc individual basis. it is time for change. while hundreds of papers on scheduling are published every year, it becomes increasingly difficult to easily identify scheduling problems and solutions. we are not aware of any general taxonomy to define job scheduling problems and solutions in distributed systems. this paper aims at shedding light on this scenario by defining such a r. lopes is with the departmento de sistemas e computac¸a˜o, universi- dade federal de campina grande, paraiba, brazil. e-mail: raquel@dsc.ufcg.edu.br d. menasce´ is with department of computer science, george mason university, fairfax, va 22030. e-mail: menasce@gmu.edu. manuscript received september 00, 2015 taxonomy and classifiying a great deal of papers through the use of this taxonomy. early seminal work aimed at defining taxonomies to classify scheduling problems and solutions exist. an impor- tant work defines a taxonomy for distributed job scheduling solutions [5]. another inspiring work defines a language to specify scheduling problems [6]. in spite of the inspiring nature of these seminal propositions, a general taxonomy that takes into account the new generation of distributed systems and scheduling problems and solutions is required. more recently, some researchers have defined tax- onomies for specific types of distributed platforms. 
how- ever, none try to cover a distributed system in general, as we argue is the most appropriate solution. the authors of [7] define a taxonomy of scheduling problems in grid computing platforms. smanchat and viriyapant [8] extend the grid taxonomy to define a taxonomy of scheduling problems in cloud computing. these taxonomies overlap in some aspects, especially those describing workload and solution, and at the same time, they are over-fitting models, not general to be applied to any kind of distributed platform known today. they consider properties that represent very specific details of each resource platform. for example, the grid taxonomy [7] only considers scheduling problems that target multi-criteria decision analysis involving cost. this excludes many scheduling problems in which cost is not considered or in which the scheduling goal considers one criterion, like minimization of makespan, that is historically the most popular scheduling goal. some properties are highly coupled with grid environments such as the cost model flexibility, and intra and interdependence among scheduling criteria. the taxonomies of workflow scheduling techniques in the cloud assume that resources are virtual machines, which is not true for all distributed platforms, even for the cloud1. some properties of the cloud taxonomy 1. metal as a service has recently arisen as a new model in which the cloud user deploys directly onto bare metal for optimum per- formance. openstack, for instance, is considering this new model (https://wiki.openstack.org/wiki/ironic). are highly coupled with traditional cloud environments, such as vm startup latency and provisioning model (on- demand, reservation or spot). it is also important to point out that the taxonomies mentioned above fail to consider some properties that are important to clearly define scheduling problems and solu- tions. 
for instance, they do not define workload compo- sition in a complete fashion, neither resource sharing or scaling. they also do not consider important requirements such as data locality and failure model. finally, they do not include properties that characterize the quality of service required by the workload. we argue that these and other features must be considered. we conclude that prior work in scheduling taxonomies is not generic or complete enough for classifying scheduling problems and solutions in distributed platforms. they either focus on specific resource categories and not distributed resources in general. we argue that a unified taxonomy is possible and, in fact, needed, in opposition to many specific overlapping taxonomies for each type of distributed platform. moreover, hybrid infrastructures are increasingly common, in which different cloud or grid computing infras- tructures inter-operate [9], [10]; cases that can be modeled by a unified taxonomy. finally and most importantly, it is easier to maintain a single taxonomy over the years than to maintain many different, overlapping ones. for these reasons, we have defined our own taxonomy to classify existing (and future) scheduling problems and solutions. the taxonomy targets the scheduling of jobs in distributed systems. the solution is clearly meaningless without the associated problem. the problem, however, can be useful alone for comparison reasons. so, we organize the taxonomy in such a way that the problem and the solution can be easily separated. we propose the use of the taxonomy to (i) instantiate different scheduling problems and (ii) classify different scheduling solutions. the contributions of this paper are four-fold. first, a comprehensive taxonomy for classifying scheduling prob- lems and solutions is defined. 
this taxonomy allows a researcher to define what is claimed, i.e., which portion of the scheduling problem space is being addressed and to define the properties of the scheduling solution in a com- prehensible fashion. this taxonomy provides a snapshot of the state-of-the-art of job scheduling in distributed systems. second, we perform an analysis of the impact of a subset of 1058 papers related to job scheduling in distributed systems from 2005 to 2015 (may, 1st). third, we apply the taxonomy to classify 109 scheduling problems and solutions published in the top-102 papers in the area, considering the number of citations per year. finally, we publish an online scheduling archive, collaboratively constructed, in which classified scheduling problems and solutions may be found and others may be added. we found that almost 22% of the papers related to job scheduling in distributed systems are never cited; 12% of the papers in the area are responsible for 66% of all citations, and 40% of the papers are cited at most twice in their entire life. this is a sad indication that we are still crawling towards a real scientific methodology. we hope that by classifying the papers using a well-known taxonomy, researchers will be able to clearly indicate what kinds of problems and solutions they are claiming. as a consequence, the classification will allow new research to be built on top of the prior art and it will be easier to know the state-of-the- art regarding specific instantiations of scheduling problems. richard hamming detected a central problem of com- puter science during his turning award lecture: perhaps the central problem we face in all of com- puter science is how we are to get to the situation where we build on top of the work of others rather than redoing so much of it in a trivially different way. science is supposed to be cumulative, not almost endless duplication of the same kind of things. 
[11]

We believe that building an adequate taxonomy constitutes a first step in the direction pointed to by Hamming. Without proper mechanisms to classify work we are doomed to ignore what others have done. Other steps are still necessary, in particular the discipline to use the taxonomy from now on and to keep it up to date. An important action in this regard is to maintain an archive of scheduling problems and solutions based on the taxonomies. For that purpose, we created a web site, the DSS Archive (Distributed Systems Scheduling) (footnote 2). We initially populated the site with the classification of 109 problems and their solutions. The idea is to collaboratively increase the number of papers cataloged. The site offers a form to facilitate the inclusion of new scheduling problems/solutions in the archive. Researchers can download the data set with all the problems and solutions classified so far and then manipulate the data using their statistical tools of preference (footnote 3).

The rest of this paper is organized as follows. Section 2 presents a background on scheduling theory and defines a scheduling problem. Section 3 introduces a taxonomy for scheduling in distributed systems that contemplates problems and solutions. Section 4 summarizes the research method and underlying review protocol, which was used to collect 1058 papers published in the last decade on job scheduling in distributed systems. The next section presents statistics about these papers, including popularity and resource categories considered. The taxonomy was used to classify 109 scheduling problems and respective solutions; the results are summarized in Section 6. Related work is discussed in Section 7. Section 8 concludes with recommendations for future research on the topic.

2 Background on Scheduling Theory

This section provides a conceptual model of scheduling problems and solutions in distributed computer systems. Some definitions in this section are based on previous work [2], [12].
We do not consider single-node scheduling problems in this paper; these have been thoroughly investigated in the field of operating systems.

Scheduling is the assignment of resources to consumers in time. In general, every instance of a scheduling problem must clearly specify three components:

• Workload, which defines the consumers of the resources. In the context of this paper a workload is composed of jobs, each defined as a collection of computational tasks. Thus, a job $j$ has $n_j$ tasks $t^j_1, \ldots, t^j_{n_j}$.
• Resources, required to execute the workload, consist of a set of distributed nodes or computers, with one or more processing cores, connected by a (typically high-speed) network. These resources may be organized in computing clusters in a local environment or in widely distributed and scalable data centers [13]. Resources are assumed to be able to execute any type of computational task and consist of whole computing units, with main memory, storage devices and network access. We assume that nodes can only communicate through message exchange.
• Scheduling requirements, which determine the scheduling goal and other requirements that must be met by the solution. Typically, the scheduling goal is to optimize one or a combination of performance metrics affected by scheduling decisions. Another important scheduling requirement is the scheduling level. It determines the granularity or the level of detail considered when making a scheduling decision. We consider two levels of scheduling decisions: job and task (footnote 4).

Footnote 2: http://lsd.ufcg.edu.br/~dssarchive
Footnote 3: We provide R scripts to facilitate data manipulation.

Scheduling is typically a dynamic activity: workload and resources may vary over time. In order to model these dynamic aspects, we consider $\mathbb{R}^+$ to denote the set of time instants of interest, which may be discrete or continuous. At any time $t$ the workload is composed of a set $W_t$ of jobs. At any time $t$ the resources consist of a set $R_t$ of resources.
Nevertheless, there are static properties of the workload and/or resources that do not change over time and are the core of our taxonomy. Let $W$ and $R$ represent the static aspects of the workload and resources, respectively. Let $Q$ be the set of scheduling requirements that must be satisfied. We define a scheduling problem as a tuple $(W, R, Q)$. A scheduling solution is associated with a given scheduling problem. There may be more than one solution to the same problem.

Footnote 4: Each task consists of one or more (lightweight) processes that must be scheduled at the computing node assigned to run the task. This constitutes a third level of scheduling, i.e., process-level, typically managed by the operating system. This level of scheduling is outside the scope of this paper.

3 Scheduling Taxonomy in Distributed Systems

The proposed taxonomy is organized into two parts: one characterizes a scheduling problem and the other a scheduling solution. The problem part (see Figure 1) consists of 17 static features that fall into three groups: workload (W), resources (R), and requirements (Q).

Fig. 1. Summary of static features related to a scheduling problem.

3.1 Workload Description

Seven features characterize the workload $W$.

1 - Job source. Defines whether jobs come from multiple users or a single user and whether the workload consists of multiple jobs or a single job. Reasonable combinations are: single-user/single-job, single-user/multi-job and multi-user/multi-job. When the workload comes from many users, scheduling is often performed from the provider standpoint.

2 - Job structure. Defines the allowed number of tasks per job and the dependency relations and communication needs among the tasks. First, this feature defines whether jobs are multi-task or single-task. For multi-task jobs, one has to determine the task homogeneity. Tasks are homogeneous when they have similar resource demands and are heterogeneous otherwise.
The tasks of a job may have precedence constraints and communication needs to be satisfied, in which case they are dependent. Dependency between tasks often brings the challenge of data locality to the scheduling problem, since data transfers come at a cost. When there are neither precedence relations among the tasks nor communication needs, tasks are independent. Based on this discussion, the job structure may be: single-task, independent homogeneous multi-task, independent heterogeneous multi-task, dependent homogeneous multi-task or dependent heterogeneous multi-task. The trivial case of a single-job and single-task workload is not interesting and is not considered here.

3 - Job flexibility. Rigid jobs require a fixed quantity of resources and cannot execute on fewer or more resources. This quantity is defined by the user at job submission time. Other classes of jobs exist [4]: moldable, malleable and evolving. When a moldable job is submitted, some entity, possibly a scheduler, decides on the quantity of resources to provide to the job. This quantity cannot be reconfigured during the job execution. Malleable jobs are moldable jobs whose computing requirements can be changed during execution by the scheduler or another system entity. Finally, evolving jobs are similar to malleable jobs, but the user decides, on the fly, the quantity of resources to assign to the job.

4 - Arrival process. Determines the set of jobs considered by the scheduler when making scheduling decisions. In an open workload model, jobs come to the system at any time and leave the system after being executed, i.e., the number of jobs in the system is not constant. In a closed workload, the number of jobs to be scheduled is fixed.

5 - Workload composition. This feature is determined by the programming model, which drives the kinds of relationships that must hold between the tasks of a job.
Some examples include bags of tasks, in which all tasks are independent from one another, and MapReduce jobs, in which all map tasks must finish before the reduce tasks start execution. A workload may be formed by jobs that follow the same programming model or may be heterogeneous. A workload that consists of jobs of the same programming model may be classified as: same model/homogeneous, when jobs are similar in terms of structure, number of tasks and resource demands; same model/same structure, when jobs are similar in terms of structure and number of tasks but differ in terms of resource demands; or same model/diverse, when jobs use the same programming model but have different structure, number of tasks, and resource demands. Dependence relations and communication patterns do not exist if jobs are single-task. As a consequence, when the workload consists of multiple single-task jobs, the workload composition must be same model/homogeneous or same model/same structure.

6 - Quality of service. Jobs may be associated with service level agreements (SLAs). Penalties may be imposed when SLAs are violated. These jobs are SLO aware, since they require service level objectives (SLOs) to be met. Jobs that are not associated with SLAs are considered best effort jobs.

7 - Real time. The workload may consist of real time jobs or non real time jobs. For the former case, we distinguish between real time jobs with hard deadlines and soft deadlines. We also consider whether tasks are periodic or aperiodic. A hard or soft real time workload is necessarily SLO aware.

3.2 Resource Description

We identified five features that characterize the resources.

1 - Resource heterogeneity. Homogeneous resource platforms consist of similar nodes in terms of processing power, storage, and networking capabilities. Heterogeneous resource platforms consist of nodes with different computing powers, in terms of processing, storage, or communication speeds.

2 - Resource scaling.
The scheduler can see the resources it can use as a fixed or dynamic infrastructure in terms of processing capacity. Some infrastructures allow rapid capacity changes in response to variations in the workload. The total capacity of a fixed-capacity resource platform does not vary in the short term. On the other hand, some distributed systems allow dynamic scaling. Three common situations lead to dynamically scalable infrastructures: (i) shutdown resources, when some nodes are turned off to save energy, temporarily reducing the online capacity of the infrastructure; the total capacity is rapidly restored by turning the machines back on; (ii) outsourcing, when it is possible to rapidly acquire resources from other resource providers, such as Infrastructure as a Service (IaaS) providers or grid peers; (iii) DVFS, when dynamic voltage and frequen

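As an illustration of the problem definition in the excerpt above, the following Python sketch models a scheduling problem as the tuple (W, R, Q) and encodes two rules stated in the text: the five job-structure classes, and the requirement that a hard or soft real time workload be SLO aware. All class, field, and function names are hypothetical and chosen only for this example, and only a few of the 17 features are represented; it is a sketch under those assumptions, not the authors' implementation.

```python
from dataclasses import dataclass
from enum import Enum


class JobStructure(Enum):
    SINGLE_TASK = "single-task"
    INDEPENDENT_HOMOGENEOUS = "independent homogeneous multi-task"
    INDEPENDENT_HETEROGENEOUS = "independent heterogeneous multi-task"
    DEPENDENT_HOMOGENEOUS = "dependent homogeneous multi-task"
    DEPENDENT_HETEROGENEOUS = "dependent heterogeneous multi-task"


class RealTime(Enum):
    NON_REAL_TIME = "non real time"
    HARD = "hard deadlines"
    SOFT = "soft deadlines"


class Scaling(Enum):
    FIXED = "fixed"
    DYNAMIC = "dynamic"


class Level(Enum):
    JOB = "job"
    TASK = "task"


def classify_job_structure(num_tasks: int, homogeneous: bool,
                           dependent: bool) -> JobStructure:
    """Map the three job-structure questions onto the five classes."""
    if num_tasks < 1:
        raise ValueError("a job needs at least one task")
    if num_tasks == 1:
        # Homogeneity and dependency are meaningless for a single task.
        return JobStructure.SINGLE_TASK
    if dependent:
        return (JobStructure.DEPENDENT_HOMOGENEOUS if homogeneous
                else JobStructure.DEPENDENT_HETEROGENEOUS)
    return (JobStructure.INDEPENDENT_HOMOGENEOUS if homogeneous
            else JobStructure.INDEPENDENT_HETEROGENEOUS)


@dataclass(frozen=True)
class Workload:  # W: a subset of the static workload features
    job_structure: JobStructure
    real_time: RealTime
    slo_aware: bool

    def __post_init__(self) -> None:
        # Rule from the text: real time workloads are necessarily SLO aware.
        if self.real_time is not RealTime.NON_REAL_TIME and not self.slo_aware:
            raise ValueError("a real time workload must be SLO aware")


@dataclass(frozen=True)
class Resources:  # R: a subset of the static resource features
    heterogeneous: bool
    scaling: Scaling


@dataclass(frozen=True)
class Requirements:  # Q: only the scheduling level is modeled here
    level: Level


@dataclass(frozen=True)
class SchedulingProblem:  # the tuple (W, R, Q)
    workload: Workload
    resources: Resources
    requirements: Requirements


problem = SchedulingProblem(
    Workload(classify_job_structure(8, homogeneous=True, dependent=False),
             RealTime.NON_REAL_TIME, slo_aware=False),
    Resources(heterogeneous=True, scaling=Scaling.DYNAMIC),
    Requirements(level=Level.TASK),
)
print(problem.workload.job_structure.value)  # independent homogeneous multi-task
```

Keeping the problem in its own immutable record, separate from any solution, mirrors the text's point that a problem can be useful alone for comparison.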
Last updated: 2019-05-14
Usage frequency: 1
Quality:

Reference: Anonymous
